ABSTRACT

It  occasionally  happens  in  economic  analyses  that  the  correctly 
specified  model  contains  variables  for  which  no  observed  data  has  been 
collected.  When  the  data  in  a  linear  regression  model  are  cross- 
sectional  it  is  possible,  under  certain  conditions  on  the  nature  of  the 
variables,  to  estimate  the  independent  effects  of  a  specific  set  of 
explanatory  variables  on  the  dependent  variable.  A  procedure  for  doing 
this  is  presented. 

A  commonly  used  model  of  reenlistment  behavior,  for  which  the  data 
base  is  cross-sectional,  satisfies  the  requisite  conditions.  This 
permits  the  estimation  of  the  independent  effect  of  the  military  wage 
on  reenlistment  rate,  as  an  illustration  of  the  proposed  procedure. 
I.   INTRODUCTION

A.  PRELIMINARY 

There  is  currently  some  concern  about  the  enlistment  and  retention  of 
men  to  serve  in  the  armed  forces  in  a  draft-free  environment.   In  defining 
the problem to be resolved, a number of studies (notably [1]) have attempted
to describe the factors which affect enlistment and reenlistment
behavior. A large part of this interest is directed toward the determination
of a military wage structure which will ensure that civilians will
enlist,  and  that  servicemen  will  reenlist,  in  sufficient  numbers  to  meet 
service  manpower  requirements.  This  paper  will  concentrate  on  a  part  of 
this  latter  problem.  Specifically,  the  purpose  here  is  to  estimate  the 
elasticity  of  reenlistment  rate  with  respect  to  military  wage  for  first- 
term  reenlistees  in  the  Navy.  Though  studies  of  this  kind  have  already 
been  conducted,  there  are   a  number  of  reasons  for  additional  study.  Among 
them  is  that  a  new  source  of  data  (previously  unused  data  in  the  form  of 
BuPers  Report  ED198A  for  fiscal  years  1964  through  1970)  is  used  here, 
which  is  more  complete  than  that  used  in  prior  studies.  As  a  consequence 
of  the  availability  of  the  new  data,  some  omissions  of  previous  studies 
may  be  corrected.  But,  most  importantly,  a  somewhat  novel  procedure  is 
used  to  estimate  the  parameter  of  interest  in  what  will  later  be  introduced 
as  the  reenlistment  model. 

B.  BACKGROUND;  DESCRIPTION  OF  THE  DATA 

In the past, extensive reliance has been placed on the technique of
gathering information about reenlistment behavior by the use of surveys
of potential reenlistees. This technique depends on before-the-fact
information, which is in the form of the stated intentions of men facing
the decision to reenlist. Typically these surveys seek to determine, by
means  of  a  question  and  response  approach  to  the  subjects,  the  factors 
which  affect  the  reenlistment  decision,  and  thus  have  value  in  indicating 
the  lines  along  which  quantitative  research  should  be  performed.  That  is, 
they  serve  primarily  to  identify  those  factors  which  should  enter  into  an 
analytic  model  of  reenlistment  behavior.  But  once  such  a  model  is 
constructed, reliable quantitative results can only be obtained by investigating
the observed behavior of potential reenlistees. This after-the-fact
information,  the  revealed  reenlistment  behavior,  is  provided  by  the  newly 
available  data  used  in  this  paper. 

Data  extracted  from  BuPers  Report  ED198A  for  use  here  have  the  form  of 
pooled  time  series  and  cross-sectional  information.  In  particular,  the 
numbers  of  men  eligible  to  reenlist  and  the  numbers  of  these  that  do  in 
fact  reenlist  are  provided  for  each  combination  of 

(1)  Pay grade: E-1 through E-9

(2)  Rate  (a  Navy  skill  or  job  specialty  classification):  BM,  QM,  ST,  TM, 
FT,  MT,  ET,  DS,  AT,  AX,  AQ,  TD,  SM,  RD,  RM,  CT,  AC,  PT,  HM,  DT,  DM,  MU, 
EA,  AG,  PH,  YN,  PN,  DP,  SK,  DK,  JO,  PC,  AK,  AZ,  GM,  MN,  IM,  OM,  EN,  BT, 
EM,  IC,  CM,  AD,  AO,  AB,  AE,  AM,  PR,  LI,  MR,  SF,  DC,  PM,  ML,  CE,  EO,  BU, 
SW,  MT,  CS,  SH,  SD,  MM,  AV,  SP,  BR,  EQ,  CU,  SO,  AW,  AS. 

(3)  Mental  Group:   I,  II,  upper  III,  lower  III,  IV. 

(4)  Fiscal year of reenlistment: 1964 through 1970.

First-term reenlistments only are considered. (First-term reenlistments
are those of servicemen completing their initial term of active obligated
service.) Reenlistments beyond the first term are considerably less
interesting, since these advanced-term reenlistments typically involve
personnel already committed (psychologically) to a Navy career.
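
The cell structure described above can be sketched as follows. This is only an illustration of the layout: the counts are invented, and the names `cells` and `reenlistment_rate` are hypothetical, not taken from BuPers Report ED198A.

```python
# Hypothetical (eligible, reenlisted) counts keyed by
# (Rate, pay grade, Mental Group, fiscal year). All numbers are invented.
cells = {
    ("ET", "E-4", "II", 1966): (120, 18),
    ("ET", "E-5", "II", 1966): (40, 12),
    ("BM", "E-4", "upper III", 1966): (200, 50),
}

def reenlistment_rate(eligible, reenlisted):
    """Observed reenlistment rate for one cell: reenlisted / eligible."""
    if eligible <= 0:
        raise ValueError("cell has no eligible men")
    return reenlisted / eligible

# One observed rate per combination of the four classifications.
rates = {key: reenlistment_rate(*counts) for key, counts in cells.items()}
```

Each key corresponds to one observation of the dependent variable; the pooled data set is the collection of such cells over all years.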


"Mental  Group,"  a  designation  akin  to  IQ  that  is  applied  to  enlisted 
personnel, is determined by testing, as is intelligence quotient. As such
it is not likely to be highly reliable. Aside from the facility with
which personnel in the higher mental groups may enter certain more technical
Rates, and the fact that it may be significant for an enlisted man
who wishes to become an officer candidate, there is no special advantage
or disadvantage accrued by designation as a member of any particular mental
group. On the contrary, there is possibly even a tendency on the
part of a certain group of men to score poorly, purposely, in the testing.
This  group  would  consist  of  some  of  the  personnel  of  better  than  average 
education  who  have  enlisted  in  the  Navy,  during  the  past  few  years  of  a 
high  level  of  military  activity  in  Vietnam,  to  fulfill  military  service 
obligation  and  to  avoid  more  hazardous  duties.  It  is  likely  that  some 
part  of  this  group,  in  merely  wishing  to  serve  their  required  time  in  the 
armed  forces,  would  seek  to  escape  prominence  in  their  enlisted  service. 
There  is,  as  a  consequence,  seemingly  little  general  incentive  to  score 
well in Mental Group testing. In addition, testing for Mental Group
classification is subject to the same criticisms that have recently been
directed at classical IQ testing: some minority groups may be put at a
disadvantage by the biased (toward comprehensibility by white mid-Americans)
nature of the test. In any case, classification by Mental Group is certainly
less reliable than cross-sectional classification by pay grade or
Rate, or time series classification by fiscal year of reenlistment. As a
consequence,  the  Mental  Group  classification  will  not  be  of  primary  interest 
here. 

Certain  of  the  Rates  included  in  the  above  report  are  unsuitable  for 
inclusion  in  the  analysis.  Those  Rates  that  are  discarded  from  the  data 
base  are  AV,  SP,  BR,  EQ,  CU,  SO,  AW,  AS,  MT,  DS  and  SD.  Any  Rate  not 
included in the study was disallowed for one of the following reasons:

1.  The  Rate  consisted  of  pay  grades  E-7  through  E-9  only; 

2.  The  Rate's  membership  consisted  in  large  part  of  foreign  nationals 
who  could  be  expected  to  reenlist  with  high  probability; 

3.  Data  for  the  Rate  were  not  available  for  each  of  the  fiscal  years 
1964  through  1970. 

The  fact  that  the  data  consists  of  a  time  series  of  cross-sections  of 
revealed  reenlistment  behavior  allows  the  correction  of  an  omission  of 
previous  research.  To  date  little  effort  has  been  made  to  establish  a 
relationship  between  the  variation  over  time  of  reenlistment  behavior  and 
the  variation  over  time  of  pecuniary  considerations  facing  the  potential 
reenlistee.  The  time  series  of  cross-sectional  data  provides  a  basis  on 
which  such  a  relationship  can  be  constructed.  The  term  "constructed"  is 
used  advisedly,  since  the  pecuniary  factors  considered  here  are  those 
imbedded  in  a  particular  model  of  reenlistment  behavior. 

Another  disadvantage  of  previous  research  has  been  that  pecuniary 
factors for potential reenlistees have only been considered in coarse
detail. The minuteness of the new cross-sectional data, on the other hand,
permits  a  more  precise  formulation  of  the  economic  factors  that  face  the 
individual  potential  reenlistee.  These  factors  vary  from  man  to  man;  they 
are  dependent  on  the  individual's  level  of  proficiency  (pay  grade),  job 
specialty  (Rate),  and  fiscal  year  in  which  the  reenlistment  decision  is 
made. 
II.  THEORY  UNDERLYING  THE  REENLISTMENT  MODEL

A.  FOUNDATION 

The  aim  in  this  paper  is  to  determine  the  rate  of  change  of  first- 
term  Navy  reenlistments  with  respect  to  the  rate  of  change  in  military 
compensation.  Toward  this  end  a  model  is  presented  to  describe 
reenlistment  behavior,  quantitatively  represented  by  reenlistment  rate, 
in  terms  of  those  variables  which  affect  the  reenlistment  decision. 
Then, using the model as a basis, the pure effect of the military wage
on  reenlistment  rate  is  determined.  Necessarily,  the  influence  of  all 
other  variables  must  be  removed  in  order  to  estimate  the  independent 
effect  of  the  military  wage. 

B.  TASTE  AND  OPPORTUNITY  FACTORS. 

Consider  an  individual  who  is  eligible  to  reenlist.  The  variables 
which  affect  his  decision  may  be  aggregated  into  three  broad  categories: 
pecuniary,  personal  non-pecuniary  and  general  non-pecuniary.  The  first 
two  of  these  categories  are  of  interest  in  this  section  (the  final 
category  is  discussed  later).  Within  the  first  category  are  all 
factors  which  reflect  opportunity  (monetary)  considerations.  It 
includes  such  variables  as  expected  basic  military  wage,  benefits  to 
servicemen  which  may  be  expressed  equivalently  in  monetary  terms,  and 
the  alternative  civilian  wage.  Elements  in  the  personal  non-pecuniary 
class  include  such  factors  as  military  job  satisfaction,  agreeability 
with  the  quality  of  home  life  offered  by  Navy  service,  adaptability 
to  the  military  hierarchy,  and  attitude  towards  sea  or  shipboard 
duty.  Variables  which  are  described  as  non-pecuniary  are  difficult  to 
quantify.  However,  by  employing  the  concept  of  reservation  wage  (for 
a  more  complete  discussion,  see,  for  example,  Gray  [2]),  the  effect  of 
these purely individual non-pecuniary factors on the reenlistment decision
can be incorporated in a variable with analytic expression. The
qualifying  phrase  "purely  individual"  is  to  be  stressed.  Just  as 
factors  which  affect  the  reenlistment  decision  and  which  are  unique  to 
each  individual  can  be  identified,  so  can  be  recognized  non-pecuniary 
factors  affecting  the  reenlistment  decision  which  are  unique  to  each 
Rate,  or  to  each  pay  grade,  or  to  each  year.  Variables  of  this  sort 
are  the  general  non-pecuniary  factors  and  will  be  introduced  and 
treated  later.  This  is  accomplished  by  considering  the  pecuniary 
compensation  that  will  just  induce  an  individual  to  reenlist.  The 
variables  in  the  class  of  personal  non-pecuniary  factors  can  be  viewed 
as  elements  which  contribute  to  the  determination  of  the  value  of 
compensation  required  to  induce  reenlistment.  Knowledge  of  this  level 
of  compensation  for  an  individual  makes  knowledge  of  the  personal  non- 
pecuniary  factors  affecting  his  reenlistment  behavior  redundant  (at 
least  in  a  study  where  interest  centers  on  macroscopic  reenlistment 
behavior). As a consequence, the personal non-pecuniary variables
need not be explicitly considered since they are imbedded into the
individual's reservation wage, which will now be defined. [This is an
advantage of the use of data describing revealed reenlistment behavior:
an individual's personal non-pecuniary attitudes are inconsequential;
the fact of his reenlistment displays that any personal dislikes of the
service were overcome by sufficient compensation.] Suppose
that an individual deliberating reenlistment is capable of estimating
the expected present value of his alternative courses of action: to
reenlist or not to reenlist. Let WM represent the present value of all
pecuniary  returns  if  his  choice  is  to  reenlist,  and  let  WC  represent 
the  present  value  of  all  pecuniary  returns  if  he  chooses  not  to  reenlist. 
WM  consists  of  two  types  of  pecuniary  returns.  Most  obviously  there  are 
those  whose  dollar  value  is  fixed  and  is  not  subject  to  individual 
interpretation: basic pay, variable reenlistment bonus, basic allowance
for subsistence, clothing allowance. There are also pecuniary
returns  whose  dollar  value  is  in  large  part  subjectively  determined  by 
the  individual:  free  medical  services  for  the  serviceman  and  his 
dependents,  Navy  exchange  and  commissary  privileges  and  others.  This 
distinction  is  not  negligible,  and  will  be  treated  explicitly  later. 
For  a  serviceman  on  active  duty,  the  determination  of  WC  is  not  as 
straightforward  as  that  of  WM.  Typically  the  serviceman  may  have  little 
more  than  a  rough  estimate,  in  the  year  in  which  the  reenlistment 
decision  is  made,  of  the  mean  wage  received  by  civilians  working  in  a 
job category similar to that of the serviceman and located in the geographical
area of interest to him. Now define WM/WC as the relative wage.
Then  the  reservation  relative  wage  is  defined  as  the  value  of  the  above 
ratio  which  will  just  induce  the  serviceman  to  reenlist.  The  individual 
will  reenlist  if  his  actual  relative  wage  is  greater  than  or  equal  to 
his  reservation  relative  wage.  Similarly,  among  the  entire  cohort  of 
eligible  reenlistees,  those  that  reenlist  will  be  those  whose  actual 
relative  wage  is  greater  than  or  equal  to  their  reservation  relative 
wage. Now consider the domain of possible values of reservation relative
wage. For each number in this domain, some portion of the eligible
population  will  reenlist.  As  a  consequence,  the  reenlistment  rate 
(over  the  eligible  population)  has  some  functional  expression  over  the 
domain of reservation relative wage. This introduces a variable of
fundamental  importance  in  constructing  an  analytic  expression  for 
reenlistment  rate. 
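
The argument above can be illustrated with a small sketch. It assumes a hypothetical cohort whose reservation relative wages follow a lognormal distribution; that distribution, its parameters, and the function name are illustrative assumptions and are not drawn from the data or from the original analysis.

```python
import random

def reenlistment_rate(reservation_wages, relative_wage):
    """Fraction of the eligible population whose reservation relative
    wage is met or exceeded by the actual relative wage WM/WC."""
    reenlist = [r for r in reservation_wages if relative_wage >= r]
    return len(reenlist) / len(reservation_wages)

# Hypothetical cohort of 1000 eligible men (assumed distribution).
rng = random.Random(0)
cohort = [rng.lognormvariate(0.2, 0.3) for _ in range(1000)]

# Reenlistment rate evaluated at several values of the relative wage.
rates = [reenlistment_rate(cohort, w) for w in (0.8, 1.0, 1.2, 1.5)]
```

Whatever distribution is assumed, the rate computed this way cannot fall as the relative wage rises, which is the sense in which reenlistment rate has a functional expression over the relative wage.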

The  form  of  the  functional  dependence  will  be  discussed  later.  It 
is worth noting here that an individual's reservation relative wage is
some fixed value of the ratio WM/WC. Presumably, an individual considering
reenlistment is able to estimate the expected present value of
pecuniary  returns  for  not  reenlisting,  so  his  reservation  relative  wage 
can  be  equivalently  expressed  as  the  ratio  of  a  sufficiently  large  value 
of  expected  present  value  of  returns  for  reenlisting  to  his  estimate  of 
returns  for  not  reenlisting.  This  says  of  course  that  for  each 
individual  the  reservation  wage  uniquely  determines  a  value  of  WM 
sufficiently large to induce reenlistment. As a consequence reenlistment
rate, for fixed WC, has a functional representation over the
domain  of  WM:  for  each  value  of  WM  a  certain  fraction  of  the  eligible 
population  with  given  WC  will  reenlist.  The  implications  of  these 
obvious  comments  are  meant  as  a  preliminary  to  later  work.  In  order  to 
assure  proper  statistical  control  of  the  variables  in  the  model,  it  is 
necessary  to  be  able  to  match  observations  of  reenlistment  rate  with 
corresponding  relative  wage.  That  is,  a  particular  set  of  men  eligible 
to  reenlist  faces  a  given  relative  wage  (the  members  of  this  set  who 
reenlist  in  the  face  of  this  relative  wage  are  those  for  whom  this 
relative  wage  is  the  reservation  relative  wage).  This  set  of  men 
eligible  to  reenlist  must  be  identifiable,  for  each  observed  relative 
wage,  in  order  to  be  able  to  perform  significant  statistical  analysis. 
By the preceding remarks, an equivalent necessary condition for proper
statistical control is that for any fixed value of WC it is possible to
identify the set of men eligible to reenlist which corresponds to any value
of WM. Or, for any value of WC and any value of WM, it is necessary to
be able to identify the appropriate corresponding eligible population.
Now just as the purpose of this section was to eliminate the necessity
of identifying, and including in the model, variables which are in the
class of personal non-pecuniary factors, a purpose of a later section
will  be  to  remove  the  requirement  that  the  value  of  WC  for  a  potential 
reenlistee  be  known.  What  will  in  effect  be  accomplished  is  that  the 
variable  WC  will  be  removed  from  the  model,  so  that  a  correspondence 
between  reenlistment  rate  and  WM  only  need  be  made  in  order  to  satisfy 
the  functional  requirement  that  reenlistment  rate  depends  on  relative 
wage  and  the  statistical  requirement  that  the  appropriate  eligible 
population  be  identifiable  for  given  WM  and  WC. 

C.  THE  REENLISTMENT  MODEL  IN  CROSS-SECTION  AND  TIME  SERIES;  OTHER 
FACTORS  AFFECTING  REENLISTMENT  RATE 

In the preceding section, a model of the form R = f(WM/WC) was
postulated, where WM and WC are as previously defined and R represents
reenlistment rate. Fisher [3] and [4] first concluded that a model of
the form R = f(ln(WM/WC)) was indicated. Specifically, Fisher concluded
that the appropriate model was expressed by:

$R = \alpha + \beta \ln(WM/WC) + \varepsilon$,

a linear expression for R in ln(WM/WC), with disturbance term $\varepsilon$. Later
work, for example Nelson [5], employed a relation of the form:

(a)   $\ln R = \alpha + \beta \ln(WM/WC) + Z + \varepsilon$,

where the term Z represents an additional set of variables which are
included in the model. The variables in Z depend, of course, on the
author of the study employing the model. A similar model in Logit form,

(b)   $\ln\left(\frac{R}{1-R}\right) = \alpha + \beta \ln(WM/WC) + Z + \varepsilon$,

has also been considered by, for example, Gray [2] and Wilburn [6].

In  this  paper  models  of  both  forms  (a)  and  (b)  will  be  considered 
for  comparative  purposes.  Note  that  equations  (a)  and  (b)  may  be 
rewritten  as: 
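(a')   $\ln R = \alpha + \beta \ln WM - \beta \ln WC + Z + \varepsilon$,

(b')   $\ln\left(\frac{R}{1-R}\right) = \alpha + \beta \ln WM - \beta \ln WC + Z + \varepsilon$.

Or:

(a'')  $R = \alpha' \left(\frac{WM}{WC}\right)^{\beta} Z' \varepsilon'$,

(b'')  $\frac{R}{1-R} = \alpha' \left(\frac{WM}{WC}\right)^{\beta} Z' \varepsilon'$,

where:

$\alpha' = \exp(\alpha)$, $Z' = \exp(Z)$, and $\varepsilon' = \exp(\varepsilon)$.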
These equations imply that, depending on which of the models (a) or
(b) is used, either $\ln R$ or $\ln(R/(1-R))$ is linear in the natural log of
the ratio WM/WC (neglecting for the moment the effect of the variables
in Z). [Note that just as reenlistment rate R can be considered to be the
sample estimate of the probability of reenlisting, the ratio R/(1-R) may
be interpreted as the sample estimate of the odds of reenlisting.] The
implicit assumption is made, then, that the potential
reenlistee values the dollars in WM and in WC in constant ratio. That
is, the potential reenlistee is indifferent to an equal percentage
change in WM and in WC: his reenlistment decision remains the same
whether the relative wage offered him is the ratio $WM_1/WC_1$, or the
ratio $(1 + a)WM_1/(1 + a)WC_1$, for any a (a may be positive, negative
or zero, representing an increase, decrease or lack of change
respectively in each of $WM_1$ and $WC_1$). This may not actually reflect the
candidate reenlistee's utility of dollars in WM and WC. The man may in
fact value a percentage increase in his civilian alternative wage WC
more highly than (or even less highly than) the same percentage increase in WM.
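
The indifference property can be checked numerically. The sketch below evaluates models (a) and (b) with the disturbance term omitted; the parameter values, the wage figures, and the function names are illustrative assumptions only.

```python
import math

def ln_rate_a(wm, wc, alpha, beta, z=0.0):
    """Model (a) without the disturbance: ln R as a function of WM/WC."""
    return alpha + beta * math.log(wm / wc) + z

def rate_b(wm, wc, alpha, beta, z=0.0):
    """Model (b) without the disturbance: invert the log-odds to get R."""
    log_odds = alpha + beta * math.log(wm / wc) + z
    return 1.0 / (1.0 + math.exp(-log_odds))

ALPHA, BETA = -1.0, 2.0          # illustrative values only
wm1, wc1, a = 6000.0, 7500.0, 0.10

base_a = ln_rate_a(wm1, wc1, ALPHA, BETA)
scaled_a = ln_rate_a((1 + a) * wm1, (1 + a) * wc1, ALPHA, BETA)
base_b = rate_b(wm1, wc1, ALPHA, BETA)
scaled_b = rate_b((1 + a) * wm1, (1 + a) * wc1, ALPHA, BETA)
```

Because both models depend on the wages only through the ratio WM/WC, the scaled and unscaled predictions agree (up to rounding) for any a greater than -1.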

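To relieve this possibly erroneous assumption, the following
revisions to models (a) and (b) will be used:

(c)   $R = \alpha' \left(\frac{WM}{WC^{\delta}}\right)^{\beta} Z' \varepsilon'$,

(d)   $\frac{R}{1-R} = \alpha' \left(\frac{WM}{WC^{\delta}}\right)^{\beta} Z' \varepsilon'$.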

The parameter $\delta$ reflects the possibility that a potential reenlistee
values a percentage change in WM and the same percentage change in WC
differently. Presumably, the value of $\delta$ is positive. If this is the
case, then: if $\delta > 1$ a percentage change in WC is valued more highly
than the same percentage change in WM; if $\delta = 1$ equations (c) and (d)
become (a) and (b); if $0 < \delta < 1$ a percentage change in WM is valued
more highly than the same percentage change in WC; if $\delta = 0$ the decision
to reenlist is independent of the candidate reenlistee's civilian
alternative wage; a value of $\delta < 0$ indicates an aversion to civilian
dollars. These equations may be rewritten as:

(c')  $\ln R = \alpha + \beta \ln WM + \gamma \ln WC + Z + \varepsilon$,

(d')  $\ln\left(\frac{R}{1-R}\right) = \alpha + \beta \ln WM + \gamma \ln WC + Z + \varepsilon$,

where:        $\gamma = -\beta\delta$.

If $\gamma = -\beta$, then the equations (c') and (d') become (a') and (b').
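
As a check on the algebra, the following sketch compares the multiplicative form (c) with its logarithmic form (c'), omitting the disturbance term. The parameter values and function names are illustrative assumptions. The expression for the shift under an equal percentage change, beta(1 - delta) ln(1 + a), follows from (c') by direct substitution and is stated here as a derived consequence, not as a result from the original text.

```python
import math

def ln_rate_c(wm, wc, alpha, beta, delta, z=0.0):
    """Log form (c'): ln R = alpha + beta ln WM + gamma ln WC + Z,
    with gamma = -beta * delta."""
    gamma = -beta * delta
    return alpha + beta * math.log(wm) + gamma * math.log(wc) + z

def rate_c_multiplicative(wm, wc, alpha, beta, delta, z=0.0):
    """Multiplicative form (c): R = exp(alpha) (WM / WC**delta)**beta exp(Z)."""
    return math.exp(alpha) * (wm / wc ** delta) ** beta * math.exp(z)

ALPHA, BETA, DELTA = -1.0, 2.0, 0.8   # illustrative values only
wm1, wc1, a = 1.1, 1.3, 0.10          # wages in arbitrary units

log_form = ln_rate_c(wm1, wc1, ALPHA, BETA, DELTA)
mult_form = math.log(rate_c_multiplicative(wm1, wc1, ALPHA, BETA, DELTA))

# Effect on ln R of raising WM and WC by the same percentage a.
shift = ln_rate_c((1 + a) * wm1, (1 + a) * wc1, ALPHA, BETA, DELTA) - log_form
expected_shift = BETA * (1 - DELTA) * math.log(1 + a)

# With delta = 1 the shift vanishes, recovering the behavior of model (a).
shift_delta_one = (ln_rate_c((1 + a) * wm1, (1 + a) * wc1, ALPHA, BETA, 1.0)
                   - ln_rate_c(wm1, wc1, ALPHA, BETA, 1.0))
```

With delta below one, as in this example, an equal percentage increase in both wages raises ln R, which matches the interpretation that a change in WM is then valued more highly than the same change in WC.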

The coefficient $\beta$ in the equations (c') and (d') is the parameter
of interest. In equation (c'), $\beta$ is the military wage elasticity of
reenlistment rate since application of the partial differential operator
$\partial$ to (c'), while neglecting the disturbance term $\varepsilon$, yields:

$\partial(\ln R) = \beta\,\partial(\ln WM) + \gamma\,\partial(\ln WC) + \partial Z$;

or

$\partial R / R = \beta\,(\partial WM / WM) + \gamma\,\partial(\ln WC) + \partial Z$.

Similarly, in equation (d') $\beta$ represents the elasticity of the odds of
reenlistment with respect to military wage.
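
The two elasticity interpretations can be verified with a finite-difference sketch, holding WC and Z fixed and omitting the disturbance. The parameter values and helper names are illustrative assumptions. The last quantity, the elasticity of R itself under the Logit form, equals beta(1 - R); this is a standard property of the logistic function and is added here for comparison, not taken from the original text.

```python
import math

ALPHA, BETA, GAMMA = -1.0, 2.0, -1.5   # illustrative values only

def rate_c(wm, wc):
    """Reenlistment rate implied by (c') with Z = 0 and no disturbance."""
    return math.exp(ALPHA + BETA * math.log(wm) + GAMMA * math.log(wc))

def rate_d(wm, wc):
    """Reenlistment rate implied by (d') with Z = 0 and no disturbance."""
    log_odds = ALPHA + BETA * math.log(wm) + GAMMA * math.log(wc)
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds_d(wm, wc):
    """Odds of reenlistment, R / (1 - R), under (d')."""
    r = rate_d(wm, wc)
    return r / (1.0 - r)

def elasticity(f, wm, wc, h=1e-6):
    """Central-difference estimate of d ln f / d ln WM at fixed WC."""
    up = math.log(f(wm * math.exp(h), wc))
    down = math.log(f(wm * math.exp(-h), wc))
    return (up - down) / (2.0 * h)

e_rate_c = elasticity(rate_c, 1.2, 1.0)   # approximately BETA
e_odds_d = elasticity(odds_d, 1.2, 1.0)   # approximately BETA
e_rate_d = elasticity(rate_d, 1.2, 1.0)   # approximately BETA * (1 - R)
r_d = rate_d(1.2, 1.0)
```

The comparison makes the distinction concrete: under (c') the coefficient is the elasticity of the rate, while under (d') it is the elasticity of the odds, and the implied elasticity of the rate is smaller by the factor (1 - R).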

It  is  now  appropriate  to  consider  some  assumptions  about  the  nature 
of  the  cross-section  and  time  series  data.  First,  consider  reenlistment 
behavior  of  cohorts  of  eligible  reenlistees  over  time.  It  seems 
reasonable  to  assume  that  an  individual  deliberating  reenlistment  is 
unaffected  by  the  past  reenlistment  behavior  of  others,  and  that  his 
decision  is  also  unaffected  by  past  values  of  relative  wage.  Stated 
equivalently,  this  assumption  is  that  the  model  contains  no  lagged 
values  of  reenlistment  rate  or  relative  wage.  This  is  a  simplified 
assumption;  it  is  of  course  also  possible  to  postulate  and  use  a 
model  which  contains  lagged  values  of  relative  wage.  Now  consider  the 
effect  of  the  war  in  Vietnam  on  initial  enlistments  or  of  general 
civilian  unemployment  on  reenlistments  in  the  Navy.  These  are  examples 
of  temporal  factors  that  can  be  expected  to  have  a  significant  effect 
on  initial  enlistments  (in  the  first  case)  or  reenlistments  (in  the 
second  case)  in  the  Navy.  It  seems  reasonable,  then,  that  a  variable 
reflecting  such  temporal  factors  should  be  included  in  the  model. 
Similarly,  a  potential  reenlistee  who  is  a  member  of  a  certain  Rate  and 
is  in  a  certain  pay  grade  may  be  affected  by  factors  peculiar  to  his 
Rate  and  pay  grade,  as  well  as  to  factors  unique  to  the  year  in  which 
the  reenlistment  decision  is  made.  In  particular,  since  enlisted  men 
in  higher  pay  grades  typically  enjoy  greater  prestige  and  increased 
personal  liberty  than  men  in  the  lower  pay  grades,  it  may  be  hypothesized 
that pay grade affects reenlistment rate in ways not expressible in
terms of pecuniary compensation, as well as in its contribution to WM.
It cannot, then, be fairly assumed that factors which depend on Rate,
pay grade or year of eligibility to reenlist do not separately influence
the reenlistment decision. As a consequence, variables representing
the influence of such factors will be included in the model. [Such
variables  are,  in  general,  unobservable  or  not  quantifiable.  Their 
inclusion  in  the  model  is  a  formalism  for  the  sake  of  completeness.] 
These  factors  are  the  general  non-pecuniary  factors  whose  existence  was 
previously  hypothesized. 

Note that nothing has yet been said about the influence of Mental
Group on the reenlistment decision. It seems likely that personnel in
different Mental Groups will reenlist at different rates. But designation
of an individual as a member of a particular Mental Group is a somewhat
less accurate, hence less meaningful for statistical purposes,
distinction than classification of personnel by Rate, pay grade or year
of reenlistment. Additionally, WM for a candidate reenlistee does not
depend  on  his  Mental  Group.  [An  individual's  expected  WC  may,  however, 
depend  on  his  Mental  Group.  If  this  is  the  case,  it  should  emerge  in 
comparison  of  results  for  separate  Mental  Groups.]  Hence,  Mental  Group 
classification  will  not  be  used  to  define  any  of  the  variables  of  the 
model.  Instead,  the  model  to  be  constructed  will  be  applied  to  all 
personnel  in  each  of  the  Mental  Groups  separately.  The  results  for  the 
Mental  Groups  will  then  be  statistically  compared. 

Now consider a potential reenlistee viewing his military and civilian
pecuniary  alternatives.  WM  depends  (in  a  manner  to  be  made  explicit 
later)  on  his  Rate  and  pay  grade  and  on  the  year  in  which  his  current 
enlistment  expires.  But  typically  the  potential  reenlistee's  view  of 
his civilian alternatives is limited; he has been efficiently isolated
from  the  civilian  world  and  civilian  labor  market  by  the  requirements 
of  his  military  service.  And,  typically,  it  is  likely  that  he  has 
been  unable  to  go  job-seeking  in  the  geographical  area  of  interest  to 
him  for  civilian  life.  So  it  may  be  realistic  to  suppose  that  the 
alternative  civilian  wage  perceived  by  the  potential  reenlistee  can  be 
considered  to  be  the  median  wage  (or  average  wage)  of  the  civilian 
population working in his skill category (craftsman, mechanical, electrical,
clerical and so on) in the year in which he is eligible to
reenlist.  This  will  be  taken  as  a  formal  assumption:  the  civilian 
alternative  wage  perceived  by  an  individual  in  a  given  Mental  Group 
depends  only  upon  his  Rate  and  the  year  in  which  the  reenlistment 
decision is made. [This assumption may be faulty in that the alternative
civilian wage may also depend on the potential reenlistee's
military  pay  grade.  That  is,  an  advanced  rank  status  in  the  military 
may  promise  higher  pay  in  the  civilian  economy,  since  it  may  be 
interpreted  as  being  equivalent  to  advanced  expertise.] 

Since the assumption has been made that variables representing R,
WM and WC are not lagged in the model, the time series data in R, WM
and WC may be considered as another cross-section. Make, for the moment,
the stronger assumption that the model contains no lagged variables at
all. [This assumption is made for the sake of simplicity of representation.
Later it will be seen that the assumption is not necessary;
equivalent results are obtained if it is not made. At the same time
it will be seen that the analogous assumption for the variables R,
WM and WC may be weakened somewhat: identical results will be
achieved even if the model contains lagged values of the variable
WC.] Then the time series, represented by year in which observations
are made, may be considered as another cross-section. Let the
subscripts i, j and t represent Rate, pay grade and year of reenlistment
eligibility. Then the equations (c') and (d') can be represented in
cross-section data as

(e)   $\ln R_{ijt} = \alpha + \beta \ln WM_{ijt} + \gamma \ln WC_{it} + A_i + B_j + C_t + \varepsilon_{ijt}$,

(f)   $\ln\left(\frac{R_{ijt}}{1-R_{ijt}}\right) = \alpha + \beta \ln WM_{ijt} + \gamma \ln WC_{it} + A_i + B_j + C_t + \varepsilon_{ijt}$,


where:

$R_{ijt}$ is observed reenlistment rate for Rate i, pay grade j, year t;

$WM_{ijt}$ is military wage for Rate i, pay grade j, year t;

$WC_{it}$ is alternative civilian wage for Rate i in year t.

The variables $A_i$, $B_j$, and $C_t$ represent all factors which influence
reenlistment in, respectively, Rate i, pay grade j, or year t uniquely;
$\varepsilon_{ijt}$ is the disturbance term for the observation of $R_{ijt}$. $A_i$, $B_j$, and
$C_t$ are the variables whose introduction into the model was promised
earlier. Note that these variables are invariant over subscripts
not included in their notational expression. For example, the factors
represented by $C_t$ depend only on the year of reenlistment, and are
invariant over Rate and pay grade.

Note that a crucial assumption implicit in equations (e) and (f)
is that the variables $R_{ijt}$ and $WM_{ijt}$ are the only variables in the
model  which  are  not  invariant  over  at  least  one  cross-sectional 
dimension  (for  convenience,  the  set  of  all  Rates  considered  in  the 
analysis  will  be  referred  to  as  a  cross-sectional  "dimension";  similarly 
for  the  set  of  all  years  and  the  set  of  all  pay  grades  considered). 
Later  work  relies  heavily  on  this  assumption. 

The models represented by equations (e) and (f) seem reasonably
complete with the introduction of the variables $A_i$, $B_j$ and $C_t$ as "catch-
all" categories to reflect all factors which influence reenlistment
depending on Rate, pay grade and year separately. But it is clear that
the inclusion of these variables creates a problem: quantification of
$A_i$, $B_j$ and $C_t$ is difficult if not impossible. Note that this problem is
indissoluble. The influence of such variables as $C_t$ and $WC_{it}$ on the
decision  of  a  potential  reenlistee  is  almost  certainly  non-trivial. 
Their  effects  cannot  reasonably  be  ignored  in  any  rational  model  of 
first- term  reenlistment  behavior.  One  possible  approach  to  resolving 
this  problem  is  to  construct  a  model  using  dummy  variables  to  represent 
Rate,  pay  grade  and  year.  But  in  the  face  of  61  rates,  nine  pay  grades 
and  seven  years  this  may  yield  results  too  minutely  specialized  to  be 
interesting  unless  a  certain  amount  of  arbitrary  aggregation  (over 
Rates,  pay  grades  and  years)  is  done.  In  any  case,  an  alternative 
procedure  for  ridding  the  models  (e)  and  (f)  of  the  effects  of  the 
variables  A.,  B.  and  C.  will  be  used  here.  Use  of  this  procedure  is 
also  motivated  by  a  desire  to  rid  the  model  of  the  variable  WC.  ,  the 
civilian  alternative  wage,  the  method  of  measurement  of  which  may  be 
subject  to  dispute. 

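To specify the procedure, consider:

(e)  $\ln R_{ijt} = \alpha + \beta \ln WM_{ijt} + \gamma \ln WC_{it} + A_i + B_j + C_t + \varepsilon_{ijt}$,

in "observed" data.

Taking the mean, for Rate i and pay grade j, over all years:

(e1)  $\ln R_{ij\cdot} = \alpha + \beta \ln WM_{ij\cdot} + \gamma \ln WC_{i\cdot} + A_i + B_j + C_{\cdot} + \varepsilon_{ij\cdot}$,

where a dot in place of a subscript denotes the mean over that subscript;
for example,

$\ln R_{ij\cdot} = \frac{1}{T} \sum_{t=1}^{T} \ln R_{ijt}$   and   $\ln WC_{i\cdot} = \frac{1}{T} \sum_{t=1}^{T} \ln WC_{it}$,

for T = number of years considered in the data.

Taking the mean, for Rate i in year t, over all pay grades:

(e2)  $\ln R_{i\cdot t} = \alpha + \beta \ln WM_{i\cdot t} + \gamma \ln WC_{it} + A_i + B_{\cdot} + C_t + \varepsilon_{i\cdot t}$.

Taking the mean, for pay grade j in year t, over all Rates:

(e3)  $\ln R_{\cdot jt} = \alpha + \beta \ln WM_{\cdot jt} + \gamma \ln WC_{\cdot t} + A_{\cdot} + B_j + C_t + \varepsilon_{\cdot jt}$.

Taking the mean, for year t, over all Rates and pay grades:

(e4)  $\ln R_{\cdot\cdot t} = \alpha + \beta \ln WM_{\cdot\cdot t} + \gamma \ln WC_{\cdot t} + A_{\cdot} + B_{\cdot} + C_t + \varepsilon_{\cdot\cdot t}$.

Taking the mean, for pay grade j, over all Rates and years:

(e5)  $\ln R_{\cdot j\cdot} = \alpha + \beta \ln WM_{\cdot j\cdot} + \gamma \ln WC_{\cdot\cdot} + A_{\cdot} + B_j + C_{\cdot} + \varepsilon_{\cdot j\cdot}$.

Taking the mean, for Rate i, over all pay grades and years:

(e6)  $\ln R_{i\cdot\cdot} = \alpha + \beta \ln WM_{i\cdot\cdot} + \gamma \ln WC_{i\cdot} + A_i + B_{\cdot} + C_{\cdot} + \varepsilon_{i\cdot\cdot}$.

Taking the grand mean:

(e7)  $\ln R_{\cdot\cdot\cdot} = \alpha + \beta \ln WM_{\cdot\cdot\cdot} + \gamma \ln WC_{\cdot\cdot} + A_{\cdot} + B_{\cdot} + C_{\cdot} + \varepsilon_{\cdot\cdot\cdot}$.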

Adding  and  subtracting, 

(e)  -   (el)  -   (e2)  -   (e3)  +  (e4)  +  (e5)  +  (e6)  -   (e7) 

yields  the  equation: 

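$\ln R_{ijt} - \ln R_{ij\cdot} - \ln R_{i\cdot t} - \ln R_{\cdot jt} + \ln R_{i\cdot\cdot} + \ln R_{\cdot j\cdot} + \ln R_{\cdot\cdot t} - \ln R_{\cdot\cdot\cdot}$

$= \beta \left( \ln WM_{ijt} - \ln WM_{ij\cdot} - \ln WM_{i\cdot t} - \ln WM_{\cdot jt} + \ln WM_{i\cdot\cdot} + \ln WM_{\cdot j\cdot} + \ln WM_{\cdot\cdot t} - \ln WM_{\cdot\cdot\cdot} \right)$

$+ \varepsilon_{ijt} - \varepsilon_{ij\cdot} - \varepsilon_{i\cdot t} - \varepsilon_{\cdot jt} + \varepsilon_{i\cdot\cdot} + \varepsilon_{\cdot j\cdot} + \varepsilon_{\cdot\cdot t} - \varepsilon_{\cdot\cdot\cdot}$.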

A  similar  result  holds  for  the  model  represented  by  equation  (f). 
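
The combination (e) - (e1) - (e2) - (e3) + (e4) + (e5) + (e6) - (e7) can be
sketched numerically. The following Python fragment is illustrative only: the
function name `normalize`, the array dimensions and the coefficient values are
hypothetical, and a complete array of observations (every Rate, pay grade and
year present) is assumed. It builds a noise-free version of model (e) and
checks that every term with fewer than three subscripts cancels.

```python
import numpy as np

def normalize(x):
    """Apply (e) - (e1) - (e2) - (e3) + (e4) + (e5) + (e6) - (e7)
    to an array indexed [i, j, t] (Rate, pay grade, year)."""
    m_ij = x.mean(axis=2, keepdims=True)        # mean over years:      x_{ij.}
    m_it = x.mean(axis=1, keepdims=True)        # mean over pay grades: x_{i.t}
    m_jt = x.mean(axis=0, keepdims=True)        # mean over Rates:      x_{.jt}
    m_i = x.mean(axis=(1, 2), keepdims=True)    # x_{i..}
    m_j = x.mean(axis=(0, 2), keepdims=True)    # x_{.j.}
    m_t = x.mean(axis=(0, 1), keepdims=True)    # x_{..t}
    return x - m_ij - m_it - m_jt + m_i + m_j + m_t - x.mean()

rng = np.random.default_rng(0)
I, J, T = 4, 3, 5                     # small hypothetical dimensions
beta = 1.5                            # hypothetical coefficient
ln_wm = rng.normal(size=(I, J, T))    # varies over all three subscripts
ln_wc = rng.normal(size=(I, 1, T))    # WC_it has no pay-grade subscript
A = rng.normal(size=(I, 1, 1))
B = rng.normal(size=(1, J, 1))
C = rng.normal(size=(1, 1, T))

# Noise-free version of model (e); broadcasting fills the missing subscripts.
nuisance = 0.3 + 0.7 * ln_wc + A + B + C
ln_r = beta * ln_wm + nuisance

y, x = normalize(ln_r), normalize(ln_wm)
beta_hat = (x * y).sum() / (x * x).sum()   # least squares through the origin
```

Because the disturbance is omitted here, `beta_hat` equals the assumed
coefficient up to rounding error, and `normalize(nuisance)` is numerically
zero.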

This is the form of the data that will be used in a linear regression
to estimate the coefficient $\beta$. For want of more convenient terminology,
data in the form above will often be referred to as "normalized
data", while the initial values of each $\ln R_{ijt}$ and $\ln WM_{ijt}$ will be
called the "original data." In addition, the procedure of obtaining
normalized data from the original data will sometimes be called "the
model" when no ambiguity is possible. Some features of "the model" in
this sense are investigated in Section IV.

Now note that any variable which has fewer than three subscripts in
its notational expression disappears from the normalized form of the
data. A little reflection shows that lagged values of any such variable
are also purged in the normalized data. In particular this holds
for the variable $WC_{it}$. As a consequence, it is only necessary, in
order to obtain the identical equation in normalized data, to assure
that the model contains no lagged values of $R_{ijt}$ and $WM_{ijt}$.

The question of the nature of the normalized disturbance term:

$\varepsilon_{ijt} - \varepsilon_{ij\cdot} - \varepsilon_{i\cdot t} - \varepsilon_{\cdot jt} + \varepsilon_{i\cdot\cdot} + \varepsilon_{\cdot j\cdot} + \varepsilon_{\cdot\cdot t} - \varepsilon_{\cdot\cdot\cdot}$

will be taken up later.

D.  THE  CONSTRUCTION  OF  WM 

The  measurement  of  WM  used  here  is  that  proposed  by  Burton  C.  Gray 
in  [13]. 

As mentioned previously, pecuniary compensation for reenlisting can
be viewed as consisting of two types of remuneration: the actual wage
received by the reenlistee and the value placed by the reenlistee on
the peripheral benefits of military service. A component of the actual
wage received by a reenlistee that is unique to first-term reenlistments
is the Variable Reenlistment Bonus (VRB). This bonus is a multiple
of the reenlistee's annual base pay (which in turn depends upon pay
grade) and varies from year to year and from Rate to Rate (depending
on the valuation placed on reenlistments in a given Rate in a given year).
VRB has since fiscal year 1965 been the primary tool used to selectively
(by Rate) influence reenlistments. Prior to FY 1965 all reenlistees
received a reenlistment bonus that was a fixed multiple of annual base
pay. Ideally, one should wish to evaluate the effect of VRB on first-term
reenlistment behavior. But since the determination of a single
parameter of interest is intended simply as being illustrative of the
fundamental goal of this paper, an investigation of the consequences
of using normalized data, this is not done. VRB enters the construction
of WM as merely another component.

Now consider the future of a reenlistee. He can reasonably expect
promotion to a higher pay grade within his next term of enlistment, with
a concurrent increase in pay. This expectation obviously influences the
reenlistment decision (for it can be supposed that fewer men would
reenlist without the promise of probable advancement in rank), but in
a way difficult to specify. The simplifying assumption is made that
this promise of increased future pay offsets the lesser valuation of
future dollars. That is, in considering the present value of WM, the
potential reenlistee employs a discount rate of zero.

A final assumption, due to the nature of the available data base,
is made. For want of other information, it is assumed that all
reenlistments are made for an obligation of four years.

With the preceding paragraphs in mind, it is possible to postulate
the following construction:


WM  =  4C  +  P 


UM  +  4(1  +  K) 


where, for a potential reenlistee, WM is the present value of military
wage for a four-year reenlistment (at a zero discount rate), P is the
reenlistee's annual base pay, VRB is the appropriate Variable Reenlistment
Bonus multiple,
C is a constant representing the monetary valuation of the
peripheral benefits of military service for a four-year
reenlistment,
K is a dimensionless multiplicative constant representing the
valuation of those benefits associated with military
service that can be expected to increase with annual base
pay. K is intended to reflect such elements as tax
advantages, allowances and commissary and exchange benefits,
whose value increases as base pay increases.
This  may  be  rewritten,  for  Rate  i,  pay  grade  j  and  year  t,  as: 

1  +  VRB 


■1^  +  4  (1  +  K) 


WM..t  -  4C+Pijt 


The construction of WM allows freedom for parameterization of the
constants C and K. In order to get an idea of the sensitivity of the
coefficient $\beta$ to changes in assumed C and K, regression analyses are
performed for various presumably reasonable values of these constants.

III.  APPLICATION

A.   PRELIMINARY 

Consider the consequences of applying the natural logarithm transformation
to the variables $R_{ijt}$ and $R_{ijt}/(1 - R_{ijt})$. These variables have
respective ranges of values of $[0, 1]$ and $[0, \infty)$, which under the natural
logarithm transformation become $(-\infty, 0]$ and $(-\infty, \infty)$. Thus this
transformation avoids the awkward situation of having a finite range of
values on the dependent variable (in the case of $R_{ijt}$) in a linear
regression analysis. But there is a limitation associated with the use
of the logarithmic transformation: under this transformation a
reenlistment rate of zero is undefined. Hence in the model represented
by equation (e) of the preceding section, no observations of zero
reenlistment rate can be allowed. Additionally, in the model represented
by equation (f), a reenlistment rate equal to one must be disallowed,
since this corresponds to an infinitely large value of the odds of
reenlistment. Accordingly, since it is desirable to use the same data
base for each of the models (e) and (f), any observations of reenlistment
rate equal to zero or one will be discarded. This is not felt to
restrict the analysis too severely since reenlistment rates of zero or
one, the extreme values of the data, typically correspond to extraordinary
classes of reenlistees. In particular, reenlistment rates of
zero are most common in very low pay grades and reenlistment rates of
one are usually observed in the highest pay grades. This suggests that
a zero reenlistment rate can usually be associated with a class of men
who show an unsuitability for military service, while a reenlistment
rate equal to one can usually be associated with the class of men who
thrive in the military. Neither of these classes is particularly
interesting for a study of general reenlistment behavior.
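
The two transformations and the exclusion of rates equal to zero or one can
be sketched as follows. This Python fragment is illustrative only; the
function name `transform_rates` and the sample rates are hypothetical.

```python
import numpy as np

def transform_rates(rates):
    """Drop rates of exactly 0 or 1, then return the kept rates together
    with ln R (model (e)) and ln[R / (1 - R)] (model (f))."""
    r = np.asarray(rates, dtype=float)
    keep = (r > 0.0) & (r < 1.0)             # both logarithms are finite only here
    kept = r[keep]
    log_rate = np.log(kept)                  # lies in (-inf, 0)
    log_odds = np.log(kept / (1.0 - kept))   # lies in (-inf, inf)
    return kept, log_rate, log_odds

kept, log_rate, log_odds = transform_rates([0.0, 0.25, 0.5, 1.0])
```

Applying one filter before both transformations keeps the same observations
in models (e) and (f), which is the reason given above for discarding rates
of zero and one together.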

Now suppose that in models (e) and (f) the error terms $\varepsilon_{ijt}$ are
independent, identically distributed Normal random variables, each with
mean zero and variance $\sigma^2$. Then the application of ordinary least
squares procedures to estimate the coefficient $\beta$ in the normalized form
of model (e),

$\ln R_{ijt} - \ln R_{ij\cdot} - \ln R_{i\cdot t} - \ln R_{\cdot jt} + \ln R_{i\cdot\cdot} + \ln R_{\cdot j\cdot} + \ln R_{\cdot\cdot t} - \ln R_{\cdot\cdot\cdot}$

$= \beta \left( \ln WM_{ijt} - \ln WM_{ij\cdot} - \ln WM_{i\cdot t} - \ln WM_{\cdot jt} + \ln WM_{i\cdot\cdot} + \ln WM_{\cdot j\cdot} + \ln WM_{\cdot\cdot t} - \ln WM_{\cdot\cdot\cdot} \right)$

$+ \varepsilon_{ijt} - \varepsilon_{ij\cdot} - \varepsilon_{i\cdot t} - \varepsilon_{\cdot jt} + \varepsilon_{i\cdot\cdot} + \varepsilon_{\cdot j\cdot} + \varepsilon_{\cdot\cdot t} - \varepsilon_{\cdot\cdot\cdot}$,

yields an unbiased estimator for this coefficient. The same is true for
ordinary least squares estimation of $\beta$ in the normalized form of model
(f). These assertions will be proved in Section IV, where it will also
be shown that the above assumption about the distribution of the
disturbance terms $\varepsilon_{ijt}$ may be relaxed somewhat.

B.  VALUES  FOR  PARAMETERIZED  C  AND  K 

Regression analyses were performed for each combination of the
following selected values of the constants C and K:

    C:  500,  1000,  1500,  2000

    K:  0.10,  0.15,  0.20

It is felt that these selected values represent a range broad enough to
include realistic possible values of the constants.

C.  THE  REGRESSION  ANALYSES 

In addition to estimating the coefficient $\beta$ in the normalized forms
of the models (e) and (f), it may be interesting (for comparative
purposes) to estimate $\beta$ in the equations:

(g)  $\ln R_{ijt} = \alpha + \beta \ln WM_{ijt} + \varepsilon_{ijt}$,

(h)  $\ln\left(\frac{R_{ijt}}{1 - R_{ijt}}\right) = \alpha + \beta \ln WM_{ijt} + \varepsilon_{ijt}$,

where it is assumed that the $\varepsilon_{ijt}$'s are independent, identically
distributed Normal random variables with mean zero and variance $\sigma^2$.
Note that these latter equations are truncated forms of the models
(e) and (f): the variables $WC_{it}$, $A_i$, $B_j$, $C_t$ are neglected.

Four selections for the value of the constant C and three choices
for the constant K yield 12 different constructions of WM. Regression
analyses are conducted for each of these constructions of WM, using
models (e) (normalized), (f) (normalized), (g) and (h) for each of five
Mental Groups. This produces 240 least squares estimations to be
considered. Results for one construction of WM for models (e) (normalized),
(f) (normalized), (g) and (h) and each of the five Mental Group
classifications are looked at in detail in this section. Less detailed
regression analysis results for the remaining 11 constructions of WM
are given in Appendix A in tabular form.
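
The count of 240 estimations follows from simple enumeration. The short
Python sketch below is illustrative only; the variable names are
hypothetical and the model and Mental Group labels are just strings.

```python
from itertools import product

C_VALUES = [500, 1000, 1500, 2000]
K_VALUES = [0.10, 0.15, 0.20]
MODELS = ["(e) normalized", "(f) normalized", "(g)", "(h)"]
MENTAL_GROUPS = ["I", "II", "upper III", "lower III", "IV"]

# Each (C, K) pair gives one construction of WM.
constructions = list(product(C_VALUES, K_VALUES))
# Each construction is estimated with every model for every Mental Group.
runs = list(product(constructions, MODELS, MENTAL_GROUPS))

print(len(constructions), len(runs))  # prints "12 240"
```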

Now consider Table I, which gives summary results for the construction
of WM using C = 500 and K = 0.10. Denote Mental Groups I, II,
upper III, lower III and IV as Mental Groups 1, 2, 3, 4 and 5 respectively.

Table I

                        $\hat{\beta}$     SE         t      $\hat{\sigma}^2$      R       N

Normalized Model (e)
    MG 1              1.17260   0.26011    4.49983   0.19601   0.1904    720
    MG 2              1.76626   0.17863    9.90073   0.15014   0.3070   1259
    MG 3              1.84425   0.21828    8.44902   0.17024   0.2956    996
    MG 4              1.34492   0.20119    6.68474   0.15299   0.2629    805
    MG 5              1.50907   0.28158    5.35927   0.13337   0.2601    530

Normalized Model (f)
    MG 1              1.87660   0.36445    5.14912   0.38339   0.2167    720
    MG 2              2.72210   0.24978   10.89793   0.29433   0.3346   1259
    MG 3              2.61042   0.30134    8.66258   0.32445   0.3025    996
    MG 4              2.00364   0.28072    7.13740   0.29784   0.2793    805
    MG 5              2.16256   0.39745    5.44106   0.26571   0.2638    530

Model (g)
    MG 1              1.36861   0.12644   10.82445   0.59642   0.3746    720
    MG 2              1.91656   0.09547   20.07401   0.65793   0.4927   1259
    MG 3              1.58111   0.11230   14.07977   0.62178   0.4078    996
    MG 4              1.44961   0.12798   11.32667   0.62849   0.3712    805
    MG 5              1.54090   0.14984   10.28386   0.46204   0.4085    530

Model (h)
    MG 1              1.85354   0.17451   10.62108   1.13624   0.3685    720
    MG 2              2.70828   0.13598   19.91696   1.33460   0.4898   1259
    MG 3              2.05608   0.15295   13.44309   1.15332   0.3922    996
    MG 4              1.93526   0.17588   11.00301   1.18701   0.3620    805
    MG 5              2.11862   0.21826    9.70676   0.98037   0.3891    530

Let $\hat{\beta}$ denote the estimate for $\beta$, SE represent the standard error of the
estimate of $\beta$, t represent the computed t-statistic, $\hat{\sigma}^2$ be the estimate
of the variance $\sigma^2$, R be the multiple correlation coefficient and N
represent the number of observations of $R_{ijt}$. [It will be shown in
Section IV that $\hat{\sigma}^2$ is an unbiased estimator for $\sigma^2$.] Note that the
computed values of the t-statistic indicate that in each of the twenty
least squares estimations of $\beta$ represented in Table I the estimated
coefficient is significantly different from zero. But also note that in
comparing results for the normalized models (e) and (f) and the
corresponding truncated non-normalized models (g) and (h), the following
differences are consistently true for each Mental Group:

1.  The values of computed t-statistic for models (g) and (h)
are greater than the values for models (e) and (f).

2.  The standard error of the estimate is less for models (g)
and (h) than for models (e) and (f).

3.  The multiple correlation coefficient R is greater for
models (g) and (h) than for models (e) and (f).

These considerations might seem to indicate that models (g) and (h)
fit the data better than the corresponding normalized forms of models
(e) and (f). But in reality the results 1., 2., and 3. are not
particularly surprising, since the computed value of t is directly proportional
to, and the computed value of SE inversely proportional to, the square
root of the sum of squared deviations from the mean of the explanatory
variable, while $1 - R^2$ is inversely proportional to the sum of squared
deviations from the mean of the dependent variable. That is, for a
single explanatory variable with observed values $x_i$, i = 1, ..., n, and
a dependent variable with observed values $y_i$, i = 1, ..., n,

$SE = \left[ \frac{\hat{\sigma}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right]^{1/2}$,

$t = \frac{\hat{\beta}}{SE}$,

and:

$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{\beta} x_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$,

where:

$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$,   $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$,

$\hat{\beta}$ is the estimated regression coefficient, and $\hat{\sigma}^2$ is the estimate of
$\sigma^2$. Hence as the sums of squared deviations from the mean of both the
explanatory variable and the dependent variable decrease, it is to be
anticipated that SE and $1 - R^2$ will increase and the computed t-statistic
will decrease. To see how this fact yields the results in comparisons
1., 2., and 3. above, consider the explanatory and dependent variables
of the models (e) (normalized) and (g). Dropping for a moment the
logarithm symbol, model (e) (normalized) has dependent variable:

$R_{ijt} - R_{ij\cdot} - R_{i\cdot t} - R_{\cdot jt} + R_{i\cdot\cdot} + R_{\cdot j\cdot} + R_{\cdot\cdot t} - R_{\cdot\cdot\cdot}$

and explanatory variable:

$WM_{ijt} - WM_{ij\cdot} - WM_{i\cdot t} - WM_{\cdot jt} + WM_{i\cdot\cdot} + WM_{\cdot j\cdot} + WM_{\cdot\cdot t} - WM_{\cdot\cdot\cdot}$

both of which have mean zero, while model (g) has dependent variable
$R_{ijt}$ and explanatory variable $WM_{ijt}$. Taking squared deviations from
the mean for the variable $R_{ijt}$:

$$
\sum_i \sum_j \sum_t (R_{ijt} - R_{\cdot\cdot\cdot})^2
= \sum_i \sum_j \sum_t (R_{ijt} - R_{ij\cdot} - R_{i\cdot t} - R_{\cdot jt} + R_{i\cdot\cdot} + R_{\cdot j\cdot} + R_{\cdot\cdot t} - R_{\cdot\cdot\cdot})^2
$$
$$
+ \; T \sum_i \sum_j (R_{ij\cdot} - R_{i\cdot\cdot} - R_{\cdot j\cdot} + R_{\cdot\cdot\cdot})^2
+ J \sum_i \sum_t (R_{i\cdot t} - R_{i\cdot\cdot} - R_{\cdot\cdot t} + R_{\cdot\cdot\cdot})^2
+ I \sum_j \sum_t (R_{\cdot jt} - R_{\cdot j\cdot} - R_{\cdot\cdot t} + R_{\cdot\cdot\cdot})^2
$$
$$
+ \; IJ \sum_t (R_{\cdot\cdot t} - R_{\cdot\cdot\cdot})^2
+ IT \sum_j (R_{\cdot j\cdot} - R_{\cdot\cdot\cdot})^2
+ JT \sum_i (R_{i\cdot\cdot} - R_{\cdot\cdot\cdot})^2
$$
$$
\geq \sum_i \sum_j \sum_t (R_{ijt} - R_{ij\cdot} - R_{i\cdot t} - R_{\cdot jt} + R_{i\cdot\cdot} + R_{\cdot j\cdot} + R_{\cdot\cdot t} - R_{\cdot\cdot\cdot})^2 ,
$$

since all terms in the above equation are non-negative. But the term
on the right hand side of this inequality is the sum of squared deviations
from the mean of the dependent variable in the normalized form of
model (e). A similar result holds in the comparison of the sum of
squared deviations from the mean of the explanatory variables in
models (e) (normalized) and (g). And a similar result holds in the
comparison of the models (f) (normalized) and (h) as well. As a
consequence, the results of comparisons 1., 2., and 3. are not unexpected.
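
The inequality between the two sums of squares can be checked numerically
with the standard three-way sum-of-squares decomposition. The Python sketch
below is illustrative only: the array `r` holds hypothetical random values
standing in for the $R_{ijt}$, and a complete I x J x T array is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, T = 5, 4, 3                        # hypothetical dimensions
r = rng.normal(size=(I, J, T))           # stand-in for the R_ijt

g = r.mean()                                  # R_{...}
m_i = r.mean(axis=(1, 2), keepdims=True)      # R_{i..}
m_j = r.mean(axis=(0, 2), keepdims=True)      # R_{.j.}
m_t = r.mean(axis=(0, 1), keepdims=True)      # R_{..t}
m_ij = r.mean(axis=2, keepdims=True)          # R_{ij.}
m_it = r.mean(axis=1, keepdims=True)          # R_{i.t}
m_jt = r.mean(axis=0, keepdims=True)          # R_{.jt}

# Normalized variable and its sum of squares (its mean is zero).
resid = r - m_ij - m_it - m_jt + m_i + m_j + m_t - g
ss_norm = (resid ** 2).sum()

# Sum of squared deviations of the original variable from its mean.
ss_total = ((r - g) ** 2).sum()

# Two-way and one-way components of the decomposition.
ss_two_way = (T * ((m_ij - m_i - m_j + g) ** 2).sum()
              + J * ((m_it - m_i - m_t + g) ** 2).sum()
              + I * ((m_jt - m_j - m_t + g) ** 2).sum())
ss_main = (J * T * ((m_i - g) ** 2).sum()
           + I * T * ((m_j - g) ** 2).sum()
           + I * J * ((m_t - g) ** 2).sum())
```

Here `ss_total` equals `ss_norm + ss_two_way + ss_main` up to rounding, so
`ss_norm` can never exceed `ss_total`.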

Now consider the estimates of $\beta$ presented in Table I. All estimates
of the military wage elasticity of the odds of reenlistment and the
probability of reenlistment exceed one. In fact, the estimates of the
elasticity of R with respect to WM cluster loosely about a value of 1.5,
while the estimates of the elasticity of $R/(1 - R)$ with respect to WM have a
median value of approximately 2. Since these estimates are based on a
single choice for the construction of WM no great import will be assigned
to them, except to note that they are not appreciably different from
estimates of these quantities obtained in other studies. For example,
estimates of the WM elasticity of R in previous studies are generally
confined to the range 0.8 to 3, with the bulk of the estimates lying
in a range of values between 1 and 2. Note that in the normalized forms
of models (e) and (f) the estimates of $\beta$ for Mental Groups II and upper
III seem to be appreciably higher than estimates of this coefficient
for Mental Groups I, lower III and IV (this apparent difference is
not so marked for models (g) and (h); in any case models (g) and (h)
are of interest here only for a comparison of results with the
corresponding normalized forms of models (e) and (f), so that the former
models will not be treated further). This result agrees very well with
prior expectations: it indicates that personnel in the highest and
lowest Mental Groups are less inclined toward reenlistment than men in
the median Mental Groups. It can be argued that this result is
reasonable since men in Mental Group I, who presumably possess greater
intellectual ability, may find greater rewards and challenges in civilian
life than in enlisted military service, while men in Mental Groups lower
III and IV may often find themselves unable to compete for advancement
successfully with men in higher Mental Groups, and may sometimes be
unable to meet demands of competence placed on them by military service.
For both the highest and lowest Mental Groups, then, enlisted military
service may be viewed as limited in opportunity. To establish the
validity of these initial observations it is desirable to determine if
the estimates $\hat{\beta}$ contained in Table I do in fact estimate different
coefficients $\beta$ for different Mental Groups (that is, whether the
same coefficient $\beta$ applies for all Mental Groups or whether different
coefficients $\beta_i$ apply for different Mental Groups).

Toward this end a statistical test, in which the estimates $\hat{\beta}$ may
be compared for each pair of Mental Groups in each of the models (e)
(normalized) and (f) (normalized), is in order. Concentrate now on
the normalized form of model (e). For the regression analysis of
Mental Group i, i = 1, ..., 5, let $\hat{\sigma}_i^2$ be the estimate of $\sigma^2$, $\hat{\beta}_i$ be the
estimate of $\beta_i$, and $n_i$ be the number of observations. Since the
estimated intercept for each least squares estimation using the
normalized form of model (e) is zero, testing for the equality of the
coefficients $\beta_i$ is equivalent to testing for the equality of the
appropriate regression lines. Now if Mental Groups i and j yield the
same regression line in the normalized form of model (e), then $\hat{\sigma}_i^2$ and
$\hat{\sigma}_j^2$ both estimate the same variance $\sigma^2$. And in this case,

$\left[ \frac{(I-1)(J-1)(T-1)\, n_i}{IJT} - 1 \right] \frac{\hat{\sigma}_i^2}{\sigma^2} \sim \chi^2$   with   $\frac{(I-1)(J-1)(T-1)\, n_i}{IJT} - 1$   degrees of freedom,

and

$\left[ \frac{(I-1)(J-1)(T-1)\, n_j}{IJT} - 1 \right] \frac{\hat{\sigma}_j^2}{\sigma^2} \sim \chi^2$   with   $\frac{(I-1)(J-1)(T-1)\, n_j}{IJT} - 1$   degrees of freedom,
where these two Chi-squared random variables are independent since they
are derived from two different (and assumed independent) populations of
random variables. [See Section IV for the development of this assertion.
Here I = 61 is the number of Rates, J = 9 is the number of pay
grades and T = 7 is the number of years considered.]

Hence as the sum of two independent $\chi^2$ random variables, the
quantity:

$\frac{1}{\sigma^2} \left[ \frac{(I-1)(J-1)(T-1)}{IJT} \left( n_i \hat{\sigma}_i^2 + n_j \hat{\sigma}_j^2 \right) - \left( \hat{\sigma}_i^2 + \hat{\sigma}_j^2 \right) \right]$

has $\chi^2$ distribution with:

$\frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2$

degrees of freedom. Now if Mental Groups i and j yield the same
regression line then $\beta_i - \beta_j = 0$, in which case $\hat{\beta}_i - \hat{\beta}_j$ is Normally
distributed with mean zero (since $\hat{\beta}_i$ and $\hat{\beta}_j$ are unbiased estimators of
$\beta_i = \beta_j$) and variance:

$\mathrm{Var}(\hat{\beta}_i - \hat{\beta}_j) = \mathrm{Var}(\hat{\beta}_i) + \mathrm{Var}(\hat{\beta}_j) = \sigma^2 \left[ \frac{1}{\sum_{k=1}^{n_i} (X_k^i - \bar{X}^i)^2} + \frac{1}{\sum_{k=1}^{n_j} (X_k^j - \bar{X}^j)^2} \right]$,

where for convenience $X_k^m$ represents the k-th observation on the
explanatory variable for the normalized form of model (e), applied to Mental
Group m = i, j. Hence:

$\frac{\hat{\beta}_i - \hat{\beta}_j}{\sigma \left[ \frac{1}{\sum_{k=1}^{n_i} (X_k^i - \bar{X}^i)^2} + \frac{1}{\sum_{k=1}^{n_j} (X_k^j - \bar{X}^j)^2} \right]^{1/2}} \sim N(0,1)$.


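As a consequence, under the composite hypothesis that $\hat{\sigma}_i^2$ and $\hat{\sigma}_j^2$ estimate
the same parameter $\sigma^2$ and that $\beta_i = \beta_j$, the quantity:

$$
\frac{ (\hat{\beta}_i - \hat{\beta}_j) \left\{ \sigma^2 \left[ \frac{1}{\sum_{k=1}^{n_i} (X_k^i - \bar{X}^i)^2} + \frac{1}{\sum_{k=1}^{n_j} (X_k^j - \bar{X}^j)^2} \right] \right\}^{-1/2} }
     { \left\{ \left[ \frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2 \right]^{-1} \frac{1}{\sigma^2} \left[ \frac{(I-1)(J-1)(T-1)}{IJT} \left( n_i \hat{\sigma}_i^2 + n_j \hat{\sigma}_j^2 \right) - \left( \hat{\sigma}_i^2 + \hat{\sigma}_j^2 \right) \right] \right\}^{1/2} }
$$
$$
= \frac{ (\hat{\beta}_i - \hat{\beta}_j) \left[ \frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2 \right]^{1/2} }
       { \left\{ \left[ \frac{1}{\sum_{k=1}^{n_i} (X_k^i - \bar{X}^i)^2} + \frac{1}{\sum_{k=1}^{n_j} (X_k^j - \bar{X}^j)^2} \right] \left[ \frac{(I-1)(J-1)(T-1)}{IJT} \left( n_i \hat{\sigma}_i^2 + n_j \hat{\sigma}_j^2 \right) - \left( \hat{\sigma}_i^2 + \hat{\sigma}_j^2 \right) \right] \right\}^{1/2} }
$$

has t-distribution with:

$\frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2$

degrees of freedom. Computing this statistic, for the normalized forms
of models (e) and (f) separately, for each pair of Mental Groups I, II,
upper III, lower III and IV yields the results given in Table II.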
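
The statistic can be evaluated directly from the quantities reported in
Table I. The Python sketch below is illustrative only: the function name
`pairwise_t` is hypothetical, and it assumes (in line with the expression
for SE given earlier) that the sum of squared deviations of the normalized
explanatory variable can be recovered as the variance estimate divided by
the squared standard error. The absolute difference of the estimates is
used, on the assumption that Table II reports magnitudes.

```python
import math

def pairwise_t(b_i, se_i, var_i, n_i, b_j, se_j, var_j, n_j, I=61, J=9, T=7):
    """t-statistic and degrees of freedom for comparing the coefficients of
    two Mental Groups in a normalized model."""
    c = (I - 1) * (J - 1) * (T - 1) / (I * J * T)
    df = c * (n_i + n_j) - 2
    # Pooled variance estimate: chi-squared numerator divided by its df.
    pooled = (c * (n_i * var_i + n_j * var_j) - (var_i + var_j)) / df
    # 1/Sxx for each group, using SE^2 = var / Sxx (an assumption, see above).
    inv_sxx = se_i ** 2 / var_i + se_j ** 2 / var_j
    t = abs(b_i - b_j) / math.sqrt(inv_sxx * pooled)
    return t, df

# Table I, normalized model (e), Mental Groups 1 and 2.
t, df = pairwise_t(1.17260, 0.26011, 0.19601, 720,
                   1.76626, 0.17863, 0.15014, 1259)
```

For these inputs the sketch gives a value of about 1.95 on about 1481
degrees of freedom, which agrees with the entry for the pair (1,2) under the
normalized form of model (e) in Table II.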

Note that for very high level of significance, none of the coefficients
$\hat{\beta}_i$, $\hat{\beta}_j$ (for either model (e) or (f)) test significantly different
from each other, so that for high chosen level of significance the
composite null hypothesis that $\hat{\sigma}_i^2$ and $\hat{\sigma}_j^2$ both estimate common $\sigma^2$ and that
$\beta_i = \beta_j$ cannot be rejected. But note that the magnitudes of the computed
t-statistics for the most part give credence (especially in the normalized
form of model (f)) to the observations that prompted this test: the sets
$\{\beta_2, \beta_3\}$ and $\{\beta_1, \beta_4, \beta_5\}$ of coefficients may be accepted as being
different from each other, and the coefficients within each of these sets
may be accepted as being the same, at an appreciably higher level of
significance than any other partition of the set $\{\beta_1, \beta_2, \beta_3, \beta_4, \beta_5\}$.

TABLE II

    i,j      t(R)     t(R/(1-R))     df

    1,2      1.95        1.98       1481
    1,3      2.00        1.57       1284
    1,4      0.53        0.28       1141
    1,5      0.84        0.51        935
    2,3      0.28        0.28       1688
    2,4      1.57        1.91       1545
    2,5      0.75        1.17       1339
    3,4      1.68        1.47       1348
    3,5      0.91        0.87       1142
    4,5      0.47        0.32        999

(i,j) refers to the comparison of coefficients for Mental Groups i and j.

t(R) is the computed t-statistic for the normalized form of model (e).

t(R/(1-R)) is the computed t-statistic for the normalized form of model (f).

df is the appropriate degrees of freedom,
$\frac{(I-1)(J-1)(T-1)}{IJT} (n_i + n_j) - 2$,
of the t-distribution to the nearest integer.

IV.  FEATURES OF THE MODEL

A.  A  MORE  GENERAL  CROSS-SECTIONAL  MODEL 

Consider a slightly more general form of the reenlistment model.
For simplicity in the derivation of results, suppose that three
cross-sectional dimensions are involved. Let $Y = X\beta + Z\Omega + \varepsilon$, where
Y is an n-vector of observations on the dependent variable, X is an n x k
matrix of observations on k explanatory variables, each of which varies
over all cross-sectional dimensions (as did $WM_{ijt}$ in the reenlistment
model), $\beta$ is a k-vector of coefficients corresponding to the variables X,
Z is an n x m matrix of observations on m explanatory variables, each of
which varies over at most two cross-sectional dimensions (as did $WC_{it}$
and $C_t$, for example, in the reenlistment model), $\Omega$ is an m-vector of
coefficients corresponding to the variables in Z. Then it is evident that,
if the observations are "normalized" as in the reenlistment model, the
variables Z will disappear from the normalized data. So the model in
normalized form becomes $Y = X\beta + \varepsilon$, where, for example, the typical
element of $\varepsilon$ is:

$\varepsilon_{ijt} - \varepsilon_{ij\cdot} - \varepsilon_{i\cdot t} - \varepsilon_{\cdot jt} + \varepsilon_{i\cdot\cdot} + \varepsilon_{\cdot j\cdot} + \varepsilon_{\cdot\cdot t} - \varepsilon_{\cdot\cdot\cdot}$

The procedure of normalizing data in this manner, then, is advantageous
when it is desirable to rid the model of one or more of the variables in
Z. For example, theoretical or practical considerations may dictate
that a variable in Z be included in the model, but this variable may in
practice turn out to be unobserved (as was $WC_{it}$ in the reenlistment
model) or even unobservable (as was $C_t$ in the reenlistment model). An
obvious disadvantage is that all the variables Z disappear in the
normalized data, so that none of the coefficients in $\Omega$ can be estimated
using normalized observations. The normalization procedure can also be
used to advantage to rid the model of disturbance terms of a certain
form. This is the subject of a later part of this section.


B.  A  NECESSARY  IDEMPOTENT  MATRIX 

Consider the set of all ordered triples of three indices, i, j, t:

{(i,j,t):  i = 1, ..., I,  j = 1, ..., J,  t = 1, ..., T}

There are IJT unique such ordered triples. Construct an IJT x IJT
matrix, the rows and columns of which are each indexed with one of the
ordered triples (i, j, t), as follows: If the k-th row of this matrix,
call it V, is indexed with $(i_1, j_1, t_1)$, then the k-th column of V is also
indexed with $(i_1, j_1, t_1)$. For the row of V indexed with $(i_1, j_1, t_1)$
and the column of V indexed with $(i_2, j_2, t_2)$, let the corresponding
element of V be equal to

    -(J-1)(T-1)/IJT        if  $i_1 \neq i_2$,  $j_1 = j_2$,  $t_1 = t_2$
    -(I-1)(T-1)/IJT        if  $i_1 = i_2$,  $j_1 \neq j_2$,  $t_1 = t_2$
    -(I-1)(J-1)/IJT        if  $i_1 = i_2$,  $j_1 = j_2$,  $t_1 \neq t_2$
    (T-1)/IJT              if  $i_1 \neq i_2$,  $j_1 \neq j_2$,  $t_1 = t_2$
    (J-1)/IJT              if  $i_1 \neq i_2$,  $j_1 = j_2$,  $t_1 \neq t_2$
    (I-1)/IJT              if  $i_1 = i_2$,  $j_1 \neq j_2$,  $t_1 \neq t_2$
    -1/IJT                 if  $i_1 \neq i_2$,  $j_1 \neq j_2$,  $t_1 \neq t_2$
    (I-1)(J-1)(T-1)/IJT    if  $i_1 = i_2$,  $j_1 = j_2$,  $t_1 = t_2$

Within each row and each column of V, then, there are (I-1) elements of
the first type, (J-1) elements of the second type, (T-1) elements of the
third type, (I-1)(J-1) elements of the fourth type, (I-1)(T-1) elements
of the fifth type, (J-1)(T-1) elements of the sixth type, (I-1)(J-1)(T-1)
elements of the seventh type, and one element of the eighth type.

From  the  symmetrical   construction  of  V,   it  is  apparent  that  V  is 
symmetric.     That  V  is  singular  is  also  apparent,  since  VN  =  0,  where  N 
is  the  n-vector  with  unit  elements   (that  is,   the  sum  of  the  elements  in 
each  row  and  each  column  of  V  is  equal   to  zero)  and  n  =  IJT. 
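
The construction above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper name `build_V` and the sizes I = 2, J = 3, T = 4 are illustrative assumptions. It encodes the eight cases compactly by letting each matching index contribute (size - 1) and each mismatched index contribute -1, which reproduces the eight values listed above, and then checks symmetry, the zero row and column sums, and the number of elements of each type in one row.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def build_V(I, J, T):
    """Return (index list, V); V is a list of rows of exact Fractions."""
    idx = list(product(range(I), range(J), range(T)))
    n = I * J * T
    V = []
    for i1, j1, t1 in idx:
        row = []
        for i2, j2, t2 in idx:
            # A matching index contributes (size - 1), a mismatch contributes -1;
            # the product over the three indices gives all eight cases.
            fi = I - 1 if i1 == i2 else -1
            fj = J - 1 if j1 == j2 else -1
            ft = T - 1 if t1 == t2 else -1
            row.append(Fraction(fi * fj * ft, n))
        V.append(row)
    return idx, V


I, J, T = 2, 3, 4  # illustrative sizes, not taken from the text
idx, V = build_V(I, J, T)
n = I * J * T

is_symmetric = all(V[r][c] == V[c][r] for r in range(n) for c in range(n))
row_sums_zero = all(sum(row) == 0 for row in V)
col_sums_zero = all(sum(V[r][c] for r in range(n)) == 0 for c in range(n))

# For the row indexed by the first triple, count the columns in each case,
# keyed by which of (i, j, t) differ from the row's own triple.
i1, j1, t1 = idx[0]
pattern_counts = Counter((i2 != i1, j2 != j1, t2 != t1) for i2, j2, t2 in idx)

print(is_symmetric, row_sums_zero, col_sums_zero)
print(dict(pattern_counts))
```

Exact rational arithmetic is used so that the checks are equalities rather than floating-point approximations.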

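And it can be shown that V is idempotent as well. Let X be an
arbitrary n x r matrix. For convenience of representation, let the mth
row of X be indexed with the same ordered triple (i, j, t) as the mth
row of V. Consider the kth column of VX. If $X^k$ is the kth column of X,
then $VX^k$ is the kth column of VX, so that without loss of generality it
is necessary only to consider the case r = 1 in order to establish the
form of VX. Let $X_{i_1 j_1 t_1}$ be a typical element of the n x 1 matrix X.
Then the $(i_1, j_1, t_1)$st element of VX is of the form:

$$
\begin{aligned}
&\frac{1}{IJT}\Big[(I-1)(J-1)(T-1)\,X_{i_1 j_1 t_1}
 - (J-1)(T-1)\sum_{i \neq i_1} X_{i j_1 t_1}
 - (I-1)(T-1)\sum_{j \neq j_1} X_{i_1 j t_1}
 - (I-1)(J-1)\sum_{t \neq t_1} X_{i_1 j_1 t} \\
&\qquad + (T-1)\sum_{i \neq i_1}\sum_{j \neq j_1} X_{i j t_1}
 + (J-1)\sum_{i \neq i_1}\sum_{t \neq t_1} X_{i j_1 t}
 + (I-1)\sum_{j \neq j_1}\sum_{t \neq t_1} X_{i_1 j t}
 - \sum_{i \neq i_1}\sum_{j \neq j_1}\sum_{t \neq t_1} X_{ijt}\Big] \\
&= \frac{1}{IJT}\Big[IJT\,X_{i_1 j_1 t_1}
 - JT\sum_{i=1}^{I} X_{i j_1 t_1}
 - IT\sum_{j=1}^{J} X_{i_1 j t_1}
 - IJ\sum_{t=1}^{T} X_{i_1 j_1 t} \\
&\qquad + T\sum_{i=1}^{I}\sum_{j=1}^{J} X_{i j t_1}
 + J\sum_{i=1}^{I}\sum_{t=1}^{T} X_{i j_1 t}
 + I\sum_{j=1}^{J}\sum_{t=1}^{T} X_{i_1 j t}
 - \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{t=1}^{T} X_{ijt}\Big] \\
&= X_{i_1 j_1 t_1}
 - \frac{1}{I}\sum_{i} X_{i j_1 t_1}
 - \frac{1}{J}\sum_{j} X_{i_1 j t_1}
 - \frac{1}{T}\sum_{t} X_{i_1 j_1 t}
 + \frac{1}{IJ}\sum_{i}\sum_{j} X_{i j t_1}
 + \frac{1}{IT}\sum_{i}\sum_{t} X_{i j_1 t}
 + \frac{1}{JT}\sum_{j}\sum_{t} X_{i_1 j t}
 - \frac{1}{IJT}\sum_{i}\sum_{j}\sum_{t} X_{ijt} \\
&= X_{i_1 j_1 t_1} - X_{\cdot j_1 t_1} - X_{i_1 \cdot t_1} - X_{i_1 j_1 \cdot}
 + X_{i_1 \cdot\cdot} + X_{\cdot j_1 \cdot} + X_{\cdot\cdot t_1} - X_{\cdot\cdot\cdot}
\end{aligned}
$$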


That  is,  the  matrix  V  is  the  linear  transformation  which  reduces  the 
original   data  X  to  data  in  the  normalized  form. 

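Now consider the matrix product VVX. Let $X^{\circ\circ}_{i_1 j_1 t_1}$ be the typical
element of VVX, and let $X^{\circ}_{i_1 j_1 t_1}$ represent the typical element of VX:

$$X^{\circ}_{i_1 j_1 t_1} = X_{i_1 j_1 t_1} - X_{\cdot j_1 t_1} - X_{i_1 \cdot t_1} - X_{i_1 j_1 \cdot} + X_{i_1 \cdot\cdot} + X_{\cdot j_1 \cdot} + X_{\cdot\cdot t_1} - X_{\cdot\cdot\cdot}$$

Analogous to the above derivation,

$$X^{\circ\circ}_{i_1 j_1 t_1} = X^{\circ}_{i_1 j_1 t_1} - X^{\circ}_{\cdot j_1 t_1} - X^{\circ}_{i_1 \cdot t_1} - X^{\circ}_{i_1 j_1 \cdot} + X^{\circ}_{i_1 \cdot\cdot} + X^{\circ}_{\cdot j_1 \cdot} + X^{\circ}_{\cdot\cdot t_1} - X^{\circ}_{\cdot\cdot\cdot}$$

But:

$$
\begin{aligned}
X^{\circ}_{\cdot j_1 t_1} &= X_{\cdot j_1 t_1} - X_{\cdot j_1 t_1} - X_{\cdot\cdot t_1} - X_{\cdot j_1 \cdot} + X_{\cdot\cdot\cdot} + X_{\cdot j_1 \cdot} + X_{\cdot\cdot t_1} - X_{\cdot\cdot\cdot} = 0 \\
X^{\circ}_{i_1 \cdot t_1} &= X_{i_1 \cdot t_1} - X_{\cdot\cdot t_1} - X_{i_1 \cdot t_1} - X_{i_1 \cdot\cdot} + X_{i_1 \cdot\cdot} + X_{\cdot\cdot\cdot} + X_{\cdot\cdot t_1} - X_{\cdot\cdot\cdot} = 0 \\
X^{\circ}_{i_1 j_1 \cdot} &= X_{i_1 j_1 \cdot} - X_{\cdot j_1 \cdot} - X_{i_1 \cdot\cdot} - X_{i_1 j_1 \cdot} + X_{i_1 \cdot\cdot} + X_{\cdot j_1 \cdot} + X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} = 0 \\
X^{\circ}_{i_1 \cdot\cdot} &= X_{i_1 \cdot\cdot} - X_{\cdot\cdot\cdot} - X_{i_1 \cdot\cdot} - X_{i_1 \cdot\cdot} + X_{i_1 \cdot\cdot} + X_{\cdot\cdot\cdot} + X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} = 0 \\
X^{\circ}_{\cdot j_1 \cdot} &= X_{\cdot j_1 \cdot} - X_{\cdot j_1 \cdot} - X_{\cdot\cdot\cdot} - X_{\cdot j_1 \cdot} + X_{\cdot\cdot\cdot} + X_{\cdot j_1 \cdot} + X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} = 0 \\
X^{\circ}_{\cdot\cdot t_1} &= X_{\cdot\cdot t_1} - X_{\cdot\cdot t_1} - X_{\cdot\cdot t_1} - X_{\cdot\cdot\cdot} + X_{\cdot\cdot\cdot} + X_{\cdot\cdot\cdot} + X_{\cdot\cdot t_1} - X_{\cdot\cdot\cdot} = 0 \\
X^{\circ}_{\cdot\cdot\cdot} &= X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} + X_{\cdot\cdot\cdot} + X_{\cdot\cdot\cdot} + X_{\cdot\cdot\cdot} - X_{\cdot\cdot\cdot} = 0.
\end{aligned}
$$

So that: $X^{\circ\circ}_{i_1 j_1 t_1} = X^{\circ}_{i_1 j_1 t_1}$.

In particular this holds for the vector $X_k$ which has zeros in each
element except the kth, which is equal to one. That is, $VVX_k = VX_k$. But
$VVX_k$ is the kth column of VV, and $VX_k$ is the kth column of V. This holds
for each k = 1, ..., IJT, so that each column of VV is equal to the
corresponding column of V. Hence VV = V, so that V is, by definition,
idempotent.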

The idempotency of V can be seen equivalently as follows. Consider
the equation $VX = \lambda X$, where $\lambda$ is any eigenvalue of V, and X is a
corresponding eigenvector ($X \neq 0$ by assumption). Pre-multiplying both sides
of this equation by V yields:

$$VVX = V\lambda X = \lambda VX = \lambda^2 X.$$

But $VVX = VX = \lambda X$, so that $\lambda X = \lambda^2 X$. So either $\lambda = 0$ or it is possible
to divide by $\lambda$ to get $X = \lambda X$. Or $X'X = X'\lambda X = \lambda X'X$, where X'X is a
strictly positive scalar. Hence if $\lambda \neq 0$, then $\lambda = X'X/X'X = 1$. That is,
for the matrix V, all eigenvalues are equal to 1 or to 0. Now the claim
that V is idempotent can be made, since a sufficient condition for a
symmetric matrix to be idempotent is that each of its non-zero
eigenvalues be equal to unity.

Now since V is idempotent, its rank is equal to its trace. And the
trace of V is equal to the sum of its diagonal elements. That is, tr(V)
= IJT [(I-1)(J-1)(T-1)/IJT] = (I-1)(J-1)(T-1). Hence the rank of V
is (I-1)(J-1)(T-1).
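
As an illustration of the rank result, the sketch below computes the trace of V and its rank by exact Gaussian elimination for one assumed set of sizes (I = 3, J = 3, T = 2, chosen arbitrarily). The helper names `build_V` and `rank` are hypothetical and not part of the original text.

```python
from fractions import Fraction
from itertools import product


def build_V(I, J, T):
    """Return V as a list of rows of exact Fractions (eight-case rule)."""
    idx = list(product(range(I), range(J), range(T)))
    n = I * J * T
    return [[Fraction((I - 1 if a == d else -1)
                      * (J - 1 if b == e else -1)
                      * (T - 1 if c == f else -1), n)
             for d, e, f in idx]
            for a, b, c in idx]


def rank(M):
    """Rank of M by exact Gaussian elimination on a copy."""
    A = [row[:] for row in M]
    r = 0
    for c in range(len(A[0])):
        if r == len(A):
            break
        pivot = next((k for k in range(r, len(A)) if A[k][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]
        for k in range(r + 1, len(A)):
            f = A[k][c] / A[r][c]
            A[k] = [a - f * b for a, b in zip(A[k], A[r])]
        r += 1
    return r


I, J, T = 3, 3, 2  # illustrative sizes
V = build_V(I, J, T)
trace_V = sum(V[k][k] for k in range(I * J * T))
rank_V = rank(V)
print(trace_V, rank_V)  # both should equal (I-1)(J-1)(T-1)
```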

C.  ORDINARY  LEAST  SQUARES  ESTIMATION  UNDER  THE  TRANSFORMATION  V 

Consider once again the model described in Section A, $Y = X\beta + Z\Omega + \varepsilon$,
where $Y$, $X$, $\beta$, $Z$, $\Omega$ and $\varepsilon$ are as defined there. Recall that the number of
cross-sectional dimensions involved was assumed, for purely illustrative
purposes, to be three. Suppose that one cross-sectional dimension is
resolved into I categories, the second dimension into J categories, and
the third dimension into T categories. Then there are n = IJT
observations in Y, and to each observation in Y there can be assigned a unique
ordered triple (i,j,t) which represents the appropriate category of each
of the cross-sectional dimensions for that observation in Y. Obviously
this same ordered triple is assigned to the corresponding observations
of the variables in X and in Z, as well as to the corresponding element
of $\varepsilon$. Now suppose that the matrix V has been constructed so that the index
of the pth row of V is equal to the index of the pth observation in Y.
Then pre-multiplying the above equation by V yields $VY = VX\beta + VZ\Omega + V\varepsilon$,
where $VZ = 0$ and $VY \neq 0 \neq VX$ since by assumption the dependent variable
whose observations are represented by Y and the k explanatory variables
whose observations are represented by X vary over all cross-sectional
dimensions, while the variables whose observations are represented by Z
vary over at most two cross-sectional dimensions. So the equation
becomes $VY = VX\beta + V\varepsilon$.

Note that the above property provides a concise operational
definition of the phrase "varies over all cross-sectional dimensions." A
non-stochastic variable whose vector of observations, over all possible
categories of the cross-sectional dimensions, is given by W may be said
to vary over all cross-sectional dimensions if $VW \neq 0$. It will be shown
in a later section that the element of VW which is indexed by (i,j,t) may
be interpreted as the three-way interaction of the ith category of one
cross-sectional dimension, the jth category of the second dimension, and
the tth category of the third dimension. Similarly, for a stochastic
variable whose vector of observations is given by W, the element of VW
indexed by (i,j,t) may be interpreted as the sample estimate of this
three-way interaction term.

Now in order to discuss the ordinary least squares estimator of $\beta$ in
the equation $VY = VX\beta + V\varepsilon$ it is necessary to consider the rank of VX.
Suppose that r(X) = k (k < n), so that $(X'X)^{-1}$ exists. If it were the
case that r(X) < k, then the coefficient vector $\beta$ in the equation
$Y = X\beta + Z\Omega + \varepsilon$ would be inestimable in the original data, since a
necessary condition for the ordinary least squares estimators, in the original
data, of $\beta$ and $\Omega$ to exist is that both X'X and Z'Z are nonsingular.
That is, these estimators in the original data, in partitioned matrix
form,

$$
\begin{bmatrix} \hat\beta \\ \hat\Omega \end{bmatrix}
=
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}^{-1}
\begin{bmatrix} X'Y \\ Z'Y \end{bmatrix}
$$

exist only if $(X'X)^{-1}$ and $(Z'Z)^{-1}$ exist. So the assumption that r(X)
= k is no more restrictive in the ordinary least squares estimation of $\beta$
using data in the form VY, VX than it was in the ordinary least squares
estimation of $\beta$ using the original data Y, X. [Note that this discussion
applies only to estimation of the originally specified k-vector $\beta$ of
coefficients. It may of course be possible, even if r(X) < k, to
estimate a linear combination of some of the coefficients in $\beta$. But this
is not the goal here.] Now since r(V) = (I-1)(J-1)(T-1), a necessary
condition for (VX)'(VX) = X'VX to be nonsingular is that r(VX) = k.
So a necessary condition is that $k \leq (I-1)(J-1)(T-1)$. That is, that the
matrix X represents observations on at most (I-1)(J-1)(T-1) explanatory
variables. Consequently, in all discussion hereafter, the requirement
that $k \leq (I-1)(J-1)(T-1) < IJT = n$ will be made.

Additionally, the requirement that r(VX) = k means that the columns
of VX must be linearly independent. But these are simply the vectors
which represent the three-way interaction terms for each variable in X.
This is a new restriction, not encountered when basing estimators upon
the original observations. It may turn out, in some cases, to prohibit
application of V in the model. It is certainly not prohibitive when X
represents observations on only one explanatory variable (as was the case
for $WM_{ijt}$ in the reenlistment model). It may be worth noting that the
circumstances in which r(VX) < k can be stated more succinctly: r(VX)
< k if and only if some linear combination of the vectors in X is in the
null space of the transformation V.

If r(VX) = k, then X'VX is nonsingular, and the ordinary least
squares estimator, under the transformation V, for $\beta$ in $Y = X\beta + Z\Omega + \varepsilon$
is

$$B = ((VX)'(VX))^{-1}(VX)'(VY) = (X'VX)^{-1}X'VY.$$
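
To make the role of V concrete, the sketch below works through the single-regressor case (k = 1), where X'VX is a scalar. Everything numeric here is an assumption made for illustration: the coefficient value 3, the made-up regressor `x` (constructed to vary over all three dimensions), and the made-up vector `z`, which stands in for the combined effect $Z\Omega$ of variables varying over at most two dimensions. No disturbance is included, so the estimator should recover the coefficient exactly.

```python
from fractions import Fraction
from itertools import product


def build_V(I, J, T):
    """Return (index list, V) using the eight-case element rule for V."""
    idx = list(product(range(I), range(J), range(T)))
    n = I * J * T
    V = [[Fraction((I - 1 if a == d else -1)
                   * (J - 1 if b == e else -1)
                   * (T - 1 if c == f else -1), n)
          for d, e, f in idx]
         for a, b, c in idx]
    return idx, V


def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


I, J, T = 3, 3, 2  # illustrative sizes
idx, V = build_V(I, J, T)

beta = 3  # hypothetical true coefficient
# Regressor with a genuine three-way interaction (varies over i, j and t jointly).
x = [(i + 1) * (j + 2) * (t + 1) ** 2 for i, j, t in idx]
# Stand-in for Z*Omega: each term varies over at most two dimensions.
z = [7 * i * j - 5 * t + 2 * j * t for i, j, t in idx]
y = [beta * xv + zv for xv, zv in zip(x, z)]

Vz = matvec(V, z)
Vy = matvec(V, y)
Vx = matvec(V, x)

# B = (X'VX)^{-1} X'VY, a scalar ratio when k = 1.
B = dot(x, Vy) / dot(x, Vx)
print(B)
```

The unobserved terms in `z` never enter the computation of `B` except through `y`, which is the point of the transformation.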

A definition of terms should now be made. B, in the equation above,
has been called an estimator for $\beta$ under the transformation V. But it
is clear that if B is linear in VY, then it is also linear in Y. That
is, for any linear transformation A, A(VY) = CY for some linear
transformation C. The reason for this apparently unnecessary terminology is that
this estimator B is the best linear unbiased estimator for $\beta$ (it will be
shown later) among all those unbiased estimators for $\beta$ that are linear in VY.
[The definition of "Best" used throughout this paper is that employed in
the Gauss-Markov theorem. An estimator $\hat\beta$ for $\beta$ in the equation
$Y = X\beta + Z\Omega + \varepsilon$ is best linear unbiased if it is linear in Y, if it is unbiased,
and if any other estimator of $\beta$ which is also linear in Y and unbiased
has a covariance matrix which exceeds that of $\hat\beta$ by a positive semidefinite
matrix.] That B can be the best unbiased estimator linear in VY and
yet not be the best unbiased estimator linear in Y is clear, since the
transformation V is not invertible. That is, no linear transformation on
VW can reproduce W. If this were possible, then there would exist some
matrix A such that AVW = W for all W. But since V is singular, there
must exist a vector $W_1$ (not identically zero) such that $VW_1 = 0$.
Specifically, $W_1 = N$ can be the n-vector with unit elements. So
$AVW_1 = A0 = 0 \neq W_1$. [Equivalently, V is not an isomorphism. It has null space
S = {W: VW = 0}. Consequently, V maps all vectors of the form Z + cN,
where c is a scalar and N the n-vector of unit elements, into the vector
VZ.] In addition to being the best linear unbiased estimator for $\beta$
under the transformation V, B is in many cases the best linear unbiased
estimator for $\beta$ as well. This is the subject of the next part of this
section.

D.   POOLED  TIME  SERIES  AND  CROSS-SECTION  DATA:   EFFECT  OF  THE  COMPOSITION 
OF  THE  DISTURBANCE  TERM  ON  THE  MODEL 

The ordinary least squares estimator for $\beta$, under V, shows a degree
of insensitivity in its quality of "best linear unbiasedness under V" to
the composition of the disturbance term of the model. The type of
composition of the disturbance term for which the property of best
linear unbiasedness, under V, of B is invariant is considered here.

It may happen that in a regression model involving time series and
cross-section data the disturbance term for an observation is composed
of effects due to the cross-section, an effect due to the time series,
and a series of remainder terms (that is, components of the disturbance
term which are due to the joint effects of cross-section and time
series).⁴ For example, the disturbance term $\varepsilon_{ijt}$ for economic entity i,
subject to factor j at time t may be given by:

1. $\varepsilon_{ijt} = \eta_{ijt} + \alpha_i + \gamma_j + \delta_t + \lambda_{ij} + \omega_{it} + \pi_{jt}$, where

2. $E(\eta_{ijt}) = 0$, i = 1, ..., I, j = 1, ..., J, t = 1, ..., T

3. $\mathrm{Var}(\eta_{ijt}) = \sigma^2$ for all i, j, t

4. The $\eta_{ijt}$'s are independent, Normally distributed random variables

5. No statements can be made concerning the distributions of the
random variables $\alpha_i$, $\gamma_j$, $\delta_t$, $\lambda_{ij}$, $\omega_{it}$, $\pi_{jt}$.

6. No statements can be made concerning the independence, or
correlations, of the random variables $\eta_{ijt}$, $\alpha_i$, $\gamma_j$, $\delta_t$, $\lambda_{ij}$, $\omega_{it}$, $\pi_{jt}$
(other than as in 4. above)

7. Each random variable is invariant over any dimension not included
as a subscript in its notational expression.

⁴ As postulated by, for example, Kuh [11] and Chetty [12].

The  disturbance  structure  hypothesized  here  is  central  to  later  work. 
For  ease  of  reference,  call  the  error  structure  formally  assumed  by 
statements  1.  through  7.  above  "disturbance  structure  (A)." 

Under the specifications of disturbance structure (A), no conclusion
can be made about the form of $E(\varepsilon)$ or $\mathrm{Var}(\varepsilon)$. Consequently no claims
can be made regarding the unbiasedness of the ordinary least squares
estimator for $\beta$ in the original data. And the generalized least squares
estimator is unknown, since $\mathrm{Var}(\varepsilon)$ is unknown. But for $\varepsilon = [\varepsilon_{ijt}]$ and
$\eta = [\eta_{ijt}]$ as specified above, $V\varepsilon = V\eta$, since
$V\alpha = V\gamma = V\delta = V\lambda = V\omega = V\pi = 0$. Hence under disturbance structure (A) the ordinary least
squares estimator, under V, for $\beta$ is unbiased:

$$
\begin{aligned}
B &= (X'VX)^{-1}X'VY \\
E(B) &= E[(X'VX)^{-1}X'VY] = E[(X'VX)^{-1}X'(VX\beta + V\varepsilon)] \\
&= \beta + (X'VX)^{-1}X'V\,E(\eta) = \beta + 0 = \beta.
\end{aligned}
$$

And the variance of B is given by:

$$
\begin{aligned}
\mathrm{Var}(B) &= E[(B-\beta)(B-\beta)'] \\
&= E[(X'VX)^{-1}X'V\varepsilon\varepsilon'VX(X'VX)^{-1}] \\
&= E[(X'VX)^{-1}X'V\eta\eta'VX(X'VX)^{-1}] \\
&= (X'VX)^{-1}X'V\,E(\eta\eta')\,VX(X'VX)^{-1} \\
&= \sigma^2 (X'VX)^{-1}X'VIVX(X'VX)^{-1} \\
&= \sigma^2 (X'VX)^{-1}X'VX(X'VX)^{-1} = \sigma^2 (X'VX)^{-1},
\end{aligned}
$$

since $E(\eta\eta') = \sigma^2 I$, and since V is idempotent.
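
The key step in the argument above, $V\varepsilon = V\eta$, can be illustrated with arbitrary draws for every component of disturbance structure (A). This is a sketch only: the seeded integer draws below are hypothetical stand-ins for realizations of the random variables, and no distributional claim is implied by the way they are generated.

```python
import random
from fractions import Fraction
from itertools import product


def build_V(I, J, T):
    """Return (index list, V) using the eight-case element rule for V."""
    idx = list(product(range(I), range(J), range(T)))
    n = I * J * T
    V = [[Fraction((I - 1 if a == d else -1)
                   * (J - 1 if b == e else -1)
                   * (T - 1 if c == f else -1), n)
          for d, e, f in idx]
         for a, b, c in idx]
    return idx, V


def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]


I, J, T = 2, 3, 3  # illustrative sizes
idx, V = build_V(I, J, T)

rng = random.Random(0)  # fixed seed so the example is reproducible


def draw():
    return rng.randint(-9, 9)


# One arbitrary value per subscript combination, for each component.
eta = {key: draw() for key in idx}
alpha = {i: draw() for i in range(I)}
gamma = {j: draw() for j in range(J)}
delta = {t: draw() for t in range(T)}
lam = {(i, j): draw() for i in range(I) for j in range(J)}
omega = {(i, t): draw() for i in range(I) for t in range(T)}
pi = {(j, t): draw() for j in range(J) for t in range(T)}

eps = [eta[i, j, t] + alpha[i] + gamma[j] + delta[t]
       + lam[i, j] + omega[i, t] + pi[j, t]
       for i, j, t in idx]
eta_vec = [eta[key] for key in idx]

V_eps = matvec(V, eps)
V_eta = matvec(V, eta_vec)
print(V_eps == V_eta)  # only the eta component survives the transformation
```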

It is now possible to show that, under disturbance structure (A),
B is the best linear unbiased estimator, under V, for $\beta$. But it is first
worthwhile to show that any linear transformation which has null space
identical to that of V (that is, any linear transformation which maps
precisely the same vectors onto the null vector) is itself a linear
transformation, under a nonsingular matrix, of V. That is, that the
matrix V which removes the stochastic variables $\alpha_i$, $\gamma_j$, $\delta_t$, $\lambda_{ij}$, $\omega_{it}$, and
$\pi_{jt}$ from the disturbance term, and under which the image of a vector
$[\eta_{ijt}]$ which varies over all dimensions is non-null, is unique up to a
nonsingular linear transformation C. Suppose there exists another linear
transformation, say A, such that $A\varepsilon = A\eta$ ($A\alpha = A\gamma = A\delta = A\lambda = A\omega = A\pi = 0$)
for all n-vectors $\varepsilon$. Then since A and V are to have the same null space,
AX = 0 if and only if VX = 0. In particular, this must hold for the
vector VX: AVX = 0 if and only if VVX = VX = 0. An equivalent statement is
that the system A(VX) = 0 has only the trivial solution VX = 0. Hence
either A is nonsingular or A = CV for nonsingular C (in the latter case
AVX = CVVX = CVX and AX = CVX). But if A is nonsingular, then AX = 0
implies that X = 0. So, for nonsingular A, A and V could not have the
same null space. Hence A = CV, for nonsingular C.

Now since CV, for nonsingular C, is the only linear transformation
which removes stochastic variables $\alpha_i$, $\gamma_j$, $\delta_t$, $\lambda_{ij}$, $\omega_{it}$, $\pi_{jt}$ from the
model, any other unbiased estimator of $\beta$ must be linear in CVY, hence
in VY. Consider any other such estimator, say AVY, where A is a k x n
matrix independent of Y.

$$
\begin{aligned}
\text{Let}\quad D &= A - (X'VX)^{-1}X'. \\
\text{Then}\quad AVY &= [D + (X'VX)^{-1}X']\,VY \\
&= [D + (X'VX)^{-1}X']\,[VX\beta + V\varepsilon] \\
&= [DVX + I]\,\beta + [D + (X'VX)^{-1}X']\,V\varepsilon. \\
\text{But}\quad E(AVY) &= (DVX + I)\,\beta + [D + (X'VX)^{-1}X']\,E(V\varepsilon) \\
&= (DVX + I)\,\beta + [D + (X'VX)^{-1}X']\,V\,E(\eta) \\
&= (DVX + I)\,\beta.
\end{aligned}
$$

So in order for AVY to be unbiased, it is necessary that DVX = 0. So the
estimator becomes $\beta + [D + (X'VX)^{-1}X']V\varepsilon$. The corresponding sampling
error is $[D + (X'VX)^{-1}X']V\varepsilon$, and the covariance matrix is:

$$
\begin{aligned}
&E[\{DV + (X'VX)^{-1}X'V\}\,V\varepsilon\varepsilon'V\,\{VD' + VX(X'VX)^{-1}\}] \\
&= [DV + (X'VX)^{-1}X'V]\;E(\eta\eta')\;[VD' + VX(X'VX)^{-1}] \\
&= \sigma^2\,[DV + (X'VX)^{-1}X'V]\,[VD' + VX(X'VX)^{-1}] \\
&= \sigma^2\,[DVD' + DVX(X'VX)^{-1} + (X'VX)^{-1}X'VD' + (X'VX)^{-1}X'VX(X'VX)^{-1}] \\
&= \sigma^2\,[DVD' + (X'VX)^{-1}].
\end{aligned}
$$

So the covariance matrix of the estimator AVY exceeds the covariance
matrix of $B = (X'VX)^{-1}X'VY$ by $\sigma^2 DVD'$, a positive semidefinite matrix. Hence
B is the best linear unbiased estimator under V in the sense that its
covariance matrix is exceeded, by a positive semidefinite matrix, by the
covariance matrix of any other linear unbiased estimator of $\beta$ under V.

And, since B is the best linear unbiased estimator for $\beta$ under V,
and since only those estimators linear in VY can claim to be unbiased,
the estimator B is the best linear unbiased estimator for $\beta$ under
disturbance structure (A).

The discussion of the hypothesized error structure has been couched
in terms of pooled cross-section and time series data. But in any
regression model involving cross-sectional data (no matter what the nature
of the cross-sectional dimensions) it is clear that, if no more specific
statement about the error structure can be made than that disturbance
structure (A) applies, then $B = (X'VX)^{-1}X'VY$ is the best linear unbiased
estimator for $\beta$.

E.  AN  UNBIASED  ESTIMATOR  FOR  $\sigma^2$

Assume disturbance structure (A) from the preceding section applies.
The purpose of this section is to show that:

$$S^2 = e'e/[(I-1)(J-1)(T-1)-k]$$

is an unbiased estimator for $\sigma^2$ in

$$\mathrm{Var}(B) = (X'VX)^{-1}\sigma^2.$$

Consider the estimator $B = (X'VX)^{-1}X'VY$ of $\beta$ in the model:

$$Y = X\beta + Z\Omega + \varepsilon, \qquad VY = VX\beta + V\varepsilon.$$

The residual vector is $e = VY - VXB = VY - VX(X'VX)^{-1}X'VY =
[V - VX(X'VX)^{-1}X'V]\,Y$. Let $M = V - VX(X'VX)^{-1}X'V$. Then e = MY and M is an
idempotent matrix with trace (I-1)(J-1)(T-1)-k. To see the idempotency
of M:

$$
\begin{aligned}
MM &= [V - VX(X'VX)^{-1}X'V]\,[V - VX(X'VX)^{-1}X'V] \\
&= V - VX(X'VX)^{-1}X'V - VX(X'VX)^{-1}X'V + VX(X'VX)^{-1}X'VX(X'VX)^{-1}X'V \\
&= V - VX(X'VX)^{-1}X'V - VX(X'VX)^{-1}X'V + VX(X'VX)^{-1}X'V \\
&= V - VX(X'VX)^{-1}X'V = M.
\end{aligned}
$$

To see tr(M) = (I-1)(J-1)(T-1) - k:

Since the trace of the difference of two matrices is equal to the
difference of the traces,

$$\mathrm{tr}(M) = \mathrm{tr}(V) - \mathrm{tr}(VX(X'VX)^{-1}X'V) = (I-1)(J-1)(T-1) - \mathrm{tr}(VX(X'VX)^{-1}X'V).$$

And since for two matrices A, B, of compatible order, tr(AB) = tr(BA),

$$\mathrm{tr}(M) = (I-1)(J-1)(T-1) - \mathrm{tr}((X'VX)^{-1}X'VX) = (I-1)(J-1)(T-1) - \mathrm{tr}(I_k) = (I-1)(J-1)(T-1) - k,$$

where $I_k$ is the identity matrix of order k.

The residual vector may also be written $e = MY = MVY = MV(X\beta + \varepsilon)
= MV\varepsilon$, since $MVX = VX - VX(X'VX)^{-1}X'VX = VX - VX = 0$.

So the error sum of squares is $e'e = \varepsilon'VM'MV\varepsilon = \varepsilon'VMV\varepsilon = \eta'VMV\eta =
\eta'M\eta$, since $V\varepsilon = V\eta$. And, since $\eta'M\eta$ is scalar, it is equal to its own
trace: $e'e = \mathrm{tr}(\eta'M\eta)$. And since tr(AB) = tr(BA), $e'e = \mathrm{tr}(\eta'M\eta) =
\mathrm{tr}(M\eta\eta')$. And since the trace of a square matrix is a linear operation
on the matrix, the expected value of the trace is equal to the trace of
the expected value:

$$E(e'e) = E[\mathrm{tr}(M\eta\eta')] = \mathrm{tr}[E(M\eta\eta')] = \mathrm{tr}[M\,E(\eta\eta')] = \mathrm{tr}[\sigma^2 MI] = \mathrm{tr}[\sigma^2 M] = \sigma^2\,\mathrm{tr}(M),$$

since for a scalar k and matrix A, tr(kA) = k tr(A).

So $E(e'e) = \sigma^2\,[(I-1)(J-1)(T-1) - k]$.

So, for $S^2 = e'e/[(I-1)(J-1)(T-1) - k]$, $E(S^2) = \sigma^2$.
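
The properties of M used in this section can be checked exactly for the single-regressor case k = 1, where $(X'VX)^{-1}$ is a scalar reciprocal. The sketch below is illustrative only: the regressor `x` is made up, the sizes are arbitrary, and the helper names are hypothetical. It verifies that M is idempotent, that its trace is (I-1)(J-1)(T-1) - k, and that MX = 0.

```python
from fractions import Fraction
from itertools import product


def build_V(I, J, T):
    """Return (index list, V) using the eight-case element rule for V."""
    idx = list(product(range(I), range(J), range(T)))
    n = I * J * T
    V = [[Fraction((I - 1 if a == d else -1)
                   * (J - 1 if b == e else -1)
                   * (T - 1 if c == f else -1), n)
          for d, e, f in idx]
         for a, b, c in idx]
    return idx, V


def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


I, J, T = 2, 3, 3  # illustrative sizes
k = 1              # one explanatory variable
idx, V = build_V(I, J, T)
n = I * J * T

# Made-up regressor with a non-zero three-way interaction.
x = [(i + 1) * (j + 1) * (t + 2) + i * j for i, j, t in idx]
Vx = matvec(V, x)
xVx = sum(a * b for a, b in zip(x, Vx))  # the scalar X'VX

# M = V - VX (X'VX)^{-1} X'V; since V is symmetric, (X'V)_c equals (VX)_c.
M = [[V[r][c] - Vx[r] * Vx[c] / xVx for c in range(n)] for r in range(n)]

trace_M = sum(M[r][r] for r in range(n))
M_idempotent = matmul(M, M) == M
Mx = matvec(M, x)
print(trace_M, M_idempotent)
```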

F.  THE  JOINT  DISTRIBUTION  OF  B  AND  $S^2$

A theorem with application in statistical analysis may be expressed
as follows: If A is an idempotent matrix and y is an n-variate Normal
random variable from a $N(0, \sigma^2 I)$ distribution, then the quadratic form
$\frac{1}{\sigma^2}\,y'Ay$ is distributed $\chi^2$ with q degrees of freedom, where q = tr(A) =
rank of A.⁵ This theorem can be applied to the results of the preceding
section which showed that $e'e = \eta'M\eta$, where M is idempotent and the
elements of $\eta$ are independent identically distributed Normal random
variables, each with mean zero and variance $\sigma^2$. By the theorem, $e'e/\sigma^2$
is distributed $\chi^2$ with (I-1)(J-1)(T-1) - k degrees of freedom.

⁵ For a proof of this theorem, as well as of the converse implication,
see Hogg, R., and Craig, A., Introduction to Mathematical Statistics,
pp. 348-351, MacMillan, 1965.

Now consider the estimator B for $\beta$. It has already been shown that
$E(B) = \beta$ and

$$\mathrm{Var}(B) = \sigma^2 (X'VX)^{-1}.$$

And

$$
\begin{aligned}
B &= (X'VX)^{-1}X'VY = (X'VX)^{-1}X'V(X\beta + \varepsilon) \\
&= (X'VX)^{-1}X'VX\beta + (X'VX)^{-1}X'V\varepsilon \\
&= \beta + (X'VX)^{-1}X'V\eta.
\end{aligned}
$$

So, since B is linear in the components of $\eta$, B has a multivariate normal
distribution also:

$$B \sim N(\beta,\ \sigma^2 (X'VX)^{-1}).$$

It can now be shown that the Chi-square and Normal distributions described
above are independent. Note that $e'e/\sigma^2 = \eta'M\eta/\sigma^2$ is an idempotent
quadratic form in $\eta$, and that $B = \beta + (X'VX)^{-1}X'V\eta$ is a vector whose
elements are linear in $\eta$, where the components of $\eta$ are independent
identically distributed random variables. A sufficient condition for
$e'e/\sigma^2$ and B to be statistically independent is that the product of
$(X'VX)^{-1}X'V$ and M be equal to the null matrix.⁶ That this is so is easily
verified:

$$
\begin{aligned}
[(X'VX)^{-1}X'V]\,M &= [(X'VX)^{-1}X'V]\,[V - VX(X'VX)^{-1}X'V] \\
&= (X'VX)^{-1}X'V - (X'VX)^{-1}X'VX(X'VX)^{-1}X'V \\
&= (X'VX)^{-1}X'V - (X'VX)^{-1}X'V = 0.
\end{aligned}
$$

Hence $e'e/\sigma^2$ and B are independent.

⁶ For a proof of this assertion, see Theil, H., Principles of
Econometrics, pp. 83-84, Wiley, 1971.

Now since:

$$S^2 = \frac{e'e}{(I-1)(J-1)(T-1) - k} = \frac{\sigma^2}{(I-1)(J-1)(T-1) - k}\cdot\frac{e'e}{\sigma^2}$$

is linear in $e'e/\sigma^2$, $S^2$ and B are independent as well.

As a consequence, it is now possible to get a joint distribution of
$S^2$ and a linear combination of the components of B. Now
$B - \beta \sim N(0,\ \sigma^2 (X'VX)^{-1})$. Let W be a k-vector of constants.

$$
\begin{aligned}
\text{Then}\quad & W'(B-\beta) \sim N(0,\ W'(X'VX)^{-1}W\,\sigma^2). \\
\text{And}\quad & \frac{W'(B-\beta)}{[\sigma^2\,W'(X'VX)^{-1}W]^{1/2}} \sim N(0,1).
\end{aligned}
$$

So that, since B and $S^2$ are independent,

$$
\frac{\dfrac{W'(B-\beta)}{[\sigma^2\,W'(X'VX)^{-1}W]^{1/2}}\;\sqrt{(I-1)(J-1)(T-1) - k}}
{\{[(I-1)(J-1)(T-1) - k]\,S^2/\sigma^2\}^{1/2}}
= \frac{W'(B-\beta)}{S\,[W'(X'VX)^{-1}W]^{1/2}}
$$

has a t-distribution with (I-1)(J-1)(T-1) - k degrees of freedom.

So a confidence interval for $W'\beta$, a linear combination of the
elements of $\beta$, is given by

$$W'B \pm t_{\alpha/2}\,S\,\{W'(X'VX)^{-1}W\}^{1/2}$$

where $t_{\alpha/2}$ is the $100(1-\alpha/2)$ percentile of a t-distribution with
(I-1)(J-1)(T-1) - k degrees of freedom.

In particular this holds for a vector W which has zeros in each
component, except for the pth element which is equal to one.
Application of this vector W will give a confidence interval for the pth
component of $\beta$, p = 1, ..., k.


G.  AN  ALTERNATE  DERIVATION  OF  V 

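The calculations which yield the elements of the matrix V, introduced
in Section B, may not be apparent. The purpose of the present section
is to delineate the sequence of steps that lead to the elements of V.

As a vehicle, consider a disturbance term of the form, once again,

(1) $$\varepsilon_{ijt} = \eta_{ijt} + \alpha_i + \gamma_j + \delta_t + \lambda_{ij} + \omega_{it} + \pi_{jt},$$

where nothing is known or can be reasonably assumed about the components of
the $\varepsilon_{ijt}$'s except that the $\eta_{ijt}$'s are independent Normal random variables, each
with mean zero and variance $\sigma^2$.

Now

(2) $$\varepsilon_{ij\cdot} = \frac{1}{T}\sum_t \varepsilon_{ijt} = \frac{1}{T}\sum_t \eta_{ijt} + \alpha_i + \gamma_j + \frac{1}{T}\sum_t \delta_t + \lambda_{ij} + \frac{1}{T}\sum_t \omega_{it} + \frac{1}{T}\sum_t \pi_{jt}$$

(3) $$\varepsilon_{i\cdot t} = \frac{1}{J}\sum_j \varepsilon_{ijt} = \frac{1}{J}\sum_j \eta_{ijt} + \alpha_i + \frac{1}{J}\sum_j \gamma_j + \delta_t + \frac{1}{J}\sum_j \lambda_{ij} + \omega_{it} + \frac{1}{J}\sum_j \pi_{jt}$$

(4) $$\varepsilon_{\cdot jt} = \frac{1}{I}\sum_i \varepsilon_{ijt} = \frac{1}{I}\sum_i \eta_{ijt} + \frac{1}{I}\sum_i \alpha_i + \gamma_j + \delta_t + \frac{1}{I}\sum_i \lambda_{ij} + \frac{1}{I}\sum_i \omega_{it} + \pi_{jt}$$

(5) $$\varepsilon_{i\cdot\cdot} = \frac{1}{JT}\sum_j\sum_t \varepsilon_{ijt} = \frac{1}{JT}\sum_j\sum_t \eta_{ijt} + \alpha_i + \frac{1}{J}\sum_j \gamma_j + \frac{1}{T}\sum_t \delta_t + \frac{1}{J}\sum_j \lambda_{ij} + \frac{1}{T}\sum_t \omega_{it} + \frac{1}{JT}\sum_j\sum_t \pi_{jt}$$

(6) $$\varepsilon_{\cdot j\cdot} = \frac{1}{IT}\sum_i\sum_t \varepsilon_{ijt} = \frac{1}{IT}\sum_i\sum_t \eta_{ijt} + \frac{1}{I}\sum_i \alpha_i + \gamma_j + \frac{1}{T}\sum_t \delta_t + \frac{1}{I}\sum_i \lambda_{ij} + \frac{1}{IT}\sum_i\sum_t \omega_{it} + \frac{1}{T}\sum_t \pi_{jt}$$

(7) $$\varepsilon_{\cdot\cdot t} = \frac{1}{IJ}\sum_i\sum_j \varepsilon_{ijt} = \frac{1}{IJ}\sum_i\sum_j \eta_{ijt} + \frac{1}{I}\sum_i \alpha_i + \frac{1}{J}\sum_j \gamma_j + \delta_t + \frac{1}{IJ}\sum_i\sum_j \lambda_{ij} + \frac{1}{I}\sum_i \omega_{it} + \frac{1}{J}\sum_j \pi_{jt}$$

(8) $$\varepsilon_{\cdot\cdot\cdot} = \frac{1}{IJT}\sum_i\sum_j\sum_t \varepsilon_{ijt} = \frac{1}{IJT}\sum_i\sum_j\sum_t \eta_{ijt} + \frac{1}{I}\sum_i \alpha_i + \frac{1}{J}\sum_j \gamma_j + \frac{1}{T}\sum_t \delta_t + \frac{1}{IJ}\sum_i\sum_j \lambda_{ij} + \frac{1}{IT}\sum_i\sum_t \omega_{it} + \frac{1}{JT}\sum_j\sum_t \pi_{jt}.$$

Adding and subtracting (1)-(2)-(3)-(4)+(5)+(6)+(7)-(8), the disturbance
term for the ijt-th observation in normalized data becomes:

$$
\begin{aligned}
\mu_{ijt} &= \varepsilon_{ijt} - \varepsilon_{ij\cdot} - \varepsilon_{i\cdot t} - \varepsilon_{\cdot jt} + \varepsilon_{i\cdot\cdot} + \varepsilon_{\cdot j\cdot} + \varepsilon_{\cdot\cdot t} - \varepsilon_{\cdot\cdot\cdot} \\
&= \eta_{ijt} - \eta_{ij\cdot} - \eta_{i\cdot t} - \eta_{\cdot jt} + \eta_{i\cdot\cdot} + \eta_{\cdot j\cdot} + \eta_{\cdot\cdot t} - \eta_{\cdot\cdot\cdot} \\
&= \frac{1}{IJT}\Big[IJT\,\eta_{ijt} - IJ\sum_t \eta_{ijt} - IT\sum_j \eta_{ijt} - JT\sum_i \eta_{ijt}
 + I\sum_j\sum_t \eta_{ijt} + J\sum_i\sum_t \eta_{ijt} + T\sum_i\sum_j \eta_{ijt} - \sum_i\sum_j\sum_t \eta_{ijt}\Big].
\end{aligned}
$$

The equations (2) through (8) above were written out in the inconvenient
summative form to make obvious the fact that the variables $\alpha_i$, $\gamma_j$, $\delta_t$,
$\lambda_{ij}$, $\omega_{it}$, and $\pi_{jt}$ disappear completely from the disturbance term of the
normalized model. This is so since the equations (1) through (8) are
written in terms of the random variables themselves, not in terms of
realizations of these random variables. These random variables also
disappear, of course, in the event that one or more of them is degenerate,
as might happen if an unobservable explanatory variable were implicitly
included in the disturbance term $\varepsilon_{ijt}$.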

The expression for $\mu_{ijt}$ consists of adding and subtracting various multiples of given random variables. But in this expression any random variable $\eta_{i_0 j_0 t_0}$ may be included under more than one summation sign. Concentrate on one normalized disturbance term, say $\mu_{i_1 j_1 t_1}$, and rearrange terms in the series of summations so that each random variable $\eta_{ijt}$ appears once and only once in the expression for $\mu_{i_1 j_1 t_1}$:

$$
\begin{aligned}
\mu_{i_1 j_1 t_1} ={}& \frac{1}{IJT}\Big\{ IJT\,\eta_{i_1 j_1 t_1} - JT\sum_{i}\eta_{i j_1 t_1} - IT\sum_{j}\eta_{i_1 j t_1} - IJ\sum_{t}\eta_{i_1 j_1 t} \\
& \quad + I\sum_{j}\sum_{t}\eta_{i_1 j t} + J\sum_{i}\sum_{t}\eta_{i j_1 t} + T\sum_{i}\sum_{j}\eta_{i j t_1} - \sum_{i}\sum_{j}\sum_{t}\eta_{ijt} \Big\} \\
={}& \frac{1}{IJT}\Big\{ (I-1)(J-1)(T-1)\,\eta_{i_1 j_1 t_1} - (J-1)(T-1)\sum_{i \ne i_1}\eta_{i j_1 t_1} \\
& \quad - (I-1)(T-1)\sum_{j \ne j_1}\eta_{i_1 j t_1} - (I-1)(J-1)\sum_{t \ne t_1}\eta_{i_1 j_1 t} \\
& \quad + (T-1)\sum_{i \ne i_1}\sum_{j \ne j_1}\eta_{i j t_1} + (J-1)\sum_{i \ne i_1}\sum_{t \ne t_1}\eta_{i j_1 t} + (I-1)\sum_{j \ne j_1}\sum_{t \ne t_1}\eta_{i_1 j t} \\
& \quad - \sum_{i \ne i_1}\sum_{j \ne j_1}\sum_{t \ne t_1}\eta_{ijt} \Big\}.
\end{aligned}
$$

So that $\mu_{i_1 j_1 t_1}$ is a series of summations of independent, identically distributed Normal random variables.

Since each of these random variables $\eta_{ijt}$ has mean zero and variance $\sigma^2$, it is clear that:

$$E(\mu_{i_1 j_1 t_1}) = 0$$

and, because each $\eta_{ijt}$ now appears under exactly one summation sign,

$$
\begin{aligned}
\mathrm{Var}(\mu_{i_1 j_1 t_1}) ={}& \Big(\frac{1}{IJT}\Big)^2 \Big\{ [(I-1)(J-1)(T-1)]^2\,\mathrm{Var}(\eta_{i_1 j_1 t_1}) + [(J-1)(T-1)]^2\,\mathrm{Var}\Big(\sum_{i \ne i_1}\eta_{i j_1 t_1}\Big) \\
& \quad + [(I-1)(T-1)]^2\,\mathrm{Var}\Big(\sum_{j \ne j_1}\eta_{i_1 j t_1}\Big) + [(I-1)(J-1)]^2\,\mathrm{Var}\Big(\sum_{t \ne t_1}\eta_{i_1 j_1 t}\Big) \\
& \quad + (T-1)^2\,\mathrm{Var}\Big(\sum_{i \ne i_1}\sum_{j \ne j_1}\eta_{i j t_1}\Big) + (J-1)^2\,\mathrm{Var}\Big(\sum_{i \ne i_1}\sum_{t \ne t_1}\eta_{i j_1 t}\Big) \\
& \quad + (I-1)^2\,\mathrm{Var}\Big(\sum_{j \ne j_1}\sum_{t \ne t_1}\eta_{i_1 j t}\Big) + (1)^2\,\mathrm{Var}\Big(\sum_{i \ne i_1}\sum_{j \ne j_1}\sum_{t \ne t_1}\eta_{ijt}\Big) \Big\} \\
={}& \frac{\sigma^2}{(IJT)^2}\Big\{ [(I-1)(J-1)(T-1)]^2 + [(J-1)(T-1)]^2 (I-1) + [(I-1)(T-1)]^2 (J-1) \\
& \quad + [(I-1)(J-1)]^2 (T-1) + (T-1)^2 (I-1)(J-1) + (J-1)^2 (I-1)(T-1) \\
& \quad + (I-1)^2 (J-1)(T-1) + (I-1)(J-1)(T-1) \Big\} \\
={}& \sigma^2\,\frac{(I-1)(J-1)(T-1)}{IJT}.
\end{aligned}
$$

Note that this applies for all $\mu_{ijt}$. And since $\mu_{ijt}$ is a linear combination of independent, identically distributed Normal random variables, $\mu_{ijt}$ is also Normally distributed.

Note that the diagonal elements of the covariance matrix $E(\mu\mu')$ are each $\sigma^2 (I-1)(J-1)(T-1)/IJT$. But also note that, since each of the $\mu_{ijt}$'s is a linear combination of the same IJT random variables $\eta_{ijt}$, $i = 1,\dots,I$, $j = 1,\dots,J$, $t = 1,\dots,T$, the $\mu_{ijt}$'s are not independent.

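The remainder of the covariance matrix may be found by straightforward but tedious calculations. Since $E(\mu_{ijt}) = 0$, these calculations (using the summative expression in the $\eta_{ijt}$'s for each $\mu_{i_1 j_1 t_1}$) yield

$$
\mathrm{Cov}(\mu_{i_1 j_1 t_1}, \mu_{i_2 j_2 t_2}) = E(\mu_{i_1 j_1 t_1}\,\mu_{i_2 j_2 t_2}) =
\begin{cases}
-\dfrac{(J-1)(T-1)\sigma^2}{IJT} & \text{if } i_1 \ne i_2,\ j_1 = j_2,\ t_1 = t_2 \\[2mm]
-\dfrac{(I-1)(T-1)\sigma^2}{IJT} & \text{if } i_1 = i_2,\ j_1 \ne j_2,\ t_1 = t_2 \\[2mm]
-\dfrac{(I-1)(J-1)\sigma^2}{IJT} & \text{if } i_1 = i_2,\ j_1 = j_2,\ t_1 \ne t_2 \\[2mm]
\dfrac{(T-1)\sigma^2}{IJT} & \text{if } i_1 \ne i_2,\ j_1 \ne j_2,\ t_1 = t_2 \\[2mm]
\dfrac{(J-1)\sigma^2}{IJT} & \text{if } i_1 \ne i_2,\ j_1 = j_2,\ t_1 \ne t_2 \\[2mm]
\dfrac{(I-1)\sigma^2}{IJT} & \text{if } i_1 = i_2,\ j_1 \ne j_2,\ t_1 \ne t_2 \\[2mm]
-\dfrac{\sigma^2}{IJT} & \text{if } i_1 \ne i_2,\ j_1 \ne j_2,\ t_1 \ne t_2
\end{cases}
$$

So that, for the matrix V previously defined, $\sigma^2 V = E(\mu\mu')$.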
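
The covariance pattern derived above can be checked numerically. The following is a minimal sketch, assuming (this construction is not stated in the text) that V can be built as a Kronecker product of three "centering" matrices with cells ordered (i, j, t), t varying fastest. The helper names `centering`, `normalization_matrix` and `element` are illustrative only.

```python
import numpy as np

def centering(n):
    """Return the n x n matrix that subtracts the mean over one dimension."""
    return np.eye(n) - np.ones((n, n)) / n

def normalization_matrix(I, J, T):
    """Assumed construction of V: Kronecker product of centering matrices."""
    return np.kron(centering(I), np.kron(centering(J), centering(T)))

I, J, T = 3, 4, 5
N = I * J * T
V = normalization_matrix(I, J, T)

def element(i1, j1, t1, i2, j2, t2):
    """Entry of V for cells (i1, j1, t1) and (i2, j2, t2)."""
    return V[(i1 * J + j1) * T + t1, (i2 * J + j2) * T + t2]

# The eight distinct entries of V (covariances divided by sigma^2),
# keyed by (i1 == i2, j1 == j2, t1 == t2).
expected = {
    (True, True, True): (I - 1) * (J - 1) * (T - 1) / N,
    (False, True, True): -(J - 1) * (T - 1) / N,
    (True, False, True): -(I - 1) * (T - 1) / N,
    (True, True, False): -(I - 1) * (J - 1) / N,
    (False, False, True): (T - 1) / N,
    (False, True, False): (J - 1) / N,
    (True, False, False): (I - 1) / N,
    (False, False, False): -1 / N,
}

# Compare each formula with the matching entry of V, using cell (0, 0, 0)
# against a cell that agrees or disagrees in each subscript as required.
errors = []
for (same_i, same_j, same_t), value in expected.items():
    i2 = 0 if same_i else 1
    j2 = 0 if same_j else 1
    t2 = 0 if same_t else 1
    errors.append(abs(element(0, 0, 0, i2, j2, t2) - value))

max_error = max(errors)
is_idempotent = np.allclose(V @ V, V)
print("largest discrepancy:", max_error, "idempotent:", is_idempotent)
```

Under this assumed construction the eight closed-form entries agree with the matrix to rounding error, and V is idempotent, as the text asserts.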

H.  THE  CASE  WHEN  FEWER  THAN  IJT  OBSERVATIONS  ARE  USED 

Suppose the components of the disturbance term are independent identically distributed Normal random variables with mean zero. Then for ordinary least squares estimation in the original data the quantity $(IJT - k)\,S_0^2/\sigma^2$ has $\chi^2$ distribution with $IJT - k$ degrees of freedom, where $S_0^2 = e'e/(IJT - k)$ is the estimator in the original data of $\sigma^2$. When normalized data are used the quantity:

$$[(I-1)(J-1)(T-1) - k]\,\frac{S^2}{\sigma^2} = \Big[\frac{(I-1)(J-1)(T-1)}{IJT}\,IJT - k\Big]\frac{S^2}{\sigma^2}$$

has $\chi^2$ distribution with $(I-1)(J-1)(T-1) - k = \frac{(I-1)(J-1)(T-1)}{IJT}\,IJT - k$ degrees of freedom, for $S^2$ the estimator of $\sigma^2$ previously derived. In addition, the latter distribution still applies when disturbance structure (A) is assumed. An analogous relationship holds when n < IJT observations are used in the least squares estimation (such a case might arise when some observations must be discarded for one reason or another). In this case, for ordinary least squares estimation in the original data the quantity $(n - k)\,S_0^2/\sigma^2$ has $\chi^2$ distribution with $n - k$ degrees of freedom. It is desired to show the analogous distribution (in $S^2$) when normalized data are used. But when not all observations are used, the method of "normalizing" the remaining observations is not obvious. The most straightforward approach is to take the appropriate means, in the normalization process, over those observations that are available.

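Then, for example, the normalization of the (i,j,t)-th observation on the dependent variable (which is assumed to be used) still has the form:

$$y_{ijt} - y_{ij\cdot} - y_{i\cdot t} - y_{\cdot jt} + y_{i\cdot\cdot} + y_{\cdot j\cdot} + y_{\cdot\cdot t} - y_{\cdot\cdot\cdot}$$

where now

$$
(*)\qquad
\begin{aligned}
y_{ij\cdot} &= \frac{1}{\|T(i,j)\|}\sum_{t \in T(i,j)} y_{ijt}, \\
y_{i\cdot t} &= \frac{1}{\|J(i,t)\|}\sum_{j \in J(i,t)} y_{ijt}, \\
y_{\cdot jt} &= \frac{1}{\|I(j,t)\|}\sum_{i \in I(j,t)} y_{ijt}, \\
y_{i\cdot\cdot} &= \frac{1}{\|J(i,t)\|\cdot\|T(i,j)\|}\sum_{j \in J(i,t)}\ \sum_{t \in T(i,j)} y_{ijt}, \\
y_{\cdot j\cdot} &= \frac{1}{\|I(j,t)\|\cdot\|T(i,j)\|}\sum_{i \in I(j,t)}\ \sum_{t \in T(i,j)} y_{ijt}, \\
y_{\cdot\cdot t} &= \frac{1}{\|I(j,t)\|\cdot\|J(i,t)\|}\sum_{i \in I(j,t)}\ \sum_{j \in J(i,t)} y_{ijt}, \\
y_{\cdot\cdot\cdot} &= \frac{1}{\|I(j,t)\|\cdot\|J(i,t)\|\cdot\|T(i,j)\|}\sum_{i \in I(j,t)}\ \sum_{j \in J(i,t)}\ \sum_{t \in T(i,j)} y_{ijt},
\end{aligned}
$$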


where, for example, $T(i,j)$ is the set of all years in which the observations of $y_{ijt}$, for Rate i and pay grade j, are used and $\|T(i,j)\|$ is the number of elements in $T(i,j)$. The normalized value of any observation which is not used in the least squares estimation is taken to be zero. The same form applies for normalization of the explanatory variables in X. With a little reflection it is seen that, in effect, this normalization process implicitly takes the value of an unused observation of any variable to be the sum of the appropriate means over observations which are in fact used. That is, an unused observation $y_{ijt}$ is taken to be equal to:

$$y_{ijt} = y_{ij\cdot} + y_{i\cdot t} + y_{\cdot jt} - y_{i\cdot\cdot} - y_{\cdot j\cdot} - y_{\cdot\cdot t} + y_{\cdot\cdot\cdot}$$

where the terms on the right hand side of this equation are as given in (*) above. In particular, this modified normalization process is applied to the disturbance terms $\varepsilon_{ijt}$ as well. Let $\mu$ represent the n-vector (n < IJT is the number of observations used) of disturbance terms under the modified normalization. Define, as in the preceding section,

$$V_0 = \frac{1}{\sigma^2}\,E(\mu\mu'),$$

where the matrix $V_0$ has order n < IJT. Note that the diagonal element of $V_0$ which corresponds to observation (i,j,t) is equal to:

$$\frac{(\|I(j,t)\|-1)(\|J(i,t)\|-1)(\|T(i,j)\|-1)}{\|I(j,t)\|\cdot\|J(i,t)\|\cdot\|T(i,j)\|}$$

since it represents the variance of a component of $\mu$ derived through the modified normalization specified in (*) above. Thus, the trace of $V_0$ is equal to:

$$\sum_{i \in \cup I(j,t)}\ \sum_{j \in \cup J(i,t)}\ \sum_{t \in \cup T(i,j)} \frac{(\|I(j,t)\|-1)(\|J(i,t)\|-1)(\|T(i,j)\|-1)}{\|I(j,t)\|\cdot\|J(i,t)\|\cdot\|T(i,j)\|}$$

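Note also that $V_0$ is symmetric and that for an arbitrary n-component disturbance vector $\varepsilon$, $V_0 V_0 \varepsilon = V_0 \varepsilon$, so that $V_0$ is idempotent. That this is so is clear since for $\varepsilon_{ij\cdot}$, $\varepsilon_{i\cdot t}$, $\varepsilon_{\cdot jt}$, $\varepsilon_{i\cdot\cdot}$, $\varepsilon_{\cdot j\cdot}$, $\varepsilon_{\cdot\cdot t}$ and $\varepsilon_{\cdot\cdot\cdot}$ as specified in the equations (*),

$$V_0\varepsilon_{ij\cdot} = V_0\varepsilon_{i\cdot t} = V_0\varepsilon_{\cdot jt} = V_0\varepsilon_{i\cdot\cdot} = V_0\varepsilon_{\cdot j\cdot} = V_0\varepsilon_{\cdot\cdot t} = V_0\varepsilon_{\cdot\cdot\cdot} = 0.$$

The matrix $V_0$ has properties analogous to the matrix V considered previously, and represents the linear transformation which projects an n-vector of observations into the modified normalization of that vector.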
Now let

$$N(n) = \mathrm{tr}(V_0) = \sum_{i \in \cup I(j,t)}\ \sum_{j \in \cup J(i,t)}\ \sum_{t \in \cup T(i,j)} \frac{(\|I(j,t)\|-1)(\|J(i,t)\|-1)(\|T(i,j)\|-1)}{\|I(j,t)\|\cdot\|J(i,t)\|\cdot\|T(i,j)\|}$$

and let $M_0 = V_0 - V_0 X (X'V_0 X)^{-1} X'V_0$, where X is now the n x k matrix of observations which results from removing the IJT - n unused observations from the original IJT x k matrix of observations X. Then the error sum of squares for the least squares estimation in modified normalized form of the data (with unused observations removed) is $e'e = \varepsilon' M_0 \varepsilon$, where $M_0$ is an idempotent matrix of rank $N(n) - k$. That $M_0$ is idempotent is clear since

$$
\begin{aligned}
M_0 M_0 &= [V_0 - V_0 X (X'V_0 X)^{-1} X'V_0]\,[V_0 - V_0 X (X'V_0 X)^{-1} X'V_0] \\
&= V_0 - V_0 X (X'V_0 X)^{-1} X'V_0 - V_0 X (X'V_0 X)^{-1} X'V_0 + V_0 X (X'V_0 X)^{-1} X'V_0 X (X'V_0 X)^{-1} X'V_0 \\
&= V_0 - V_0 X (X'V_0 X)^{-1} X'V_0 = M_0.
\end{aligned}
$$

And $M_0$ has trace (hence rank) $N(n) - k$ since

$$
\begin{aligned}
\mathrm{tr}(M_0) &= \mathrm{tr}[V_0 - V_0 X (X'V_0 X)^{-1} X'V_0] \\
&= \mathrm{tr}(V_0) - \mathrm{tr}[V_0 X (X'V_0 X)^{-1} X'V_0] \\
&= \mathrm{tr}(V_0) - \mathrm{tr}[X'V_0 X (X'V_0 X)^{-1}] \\
&= N(n) - k.
\end{aligned}
$$

Hence for disturbance term $\varepsilon$ specified by:

$$\varepsilon_{ijt} = \eta_{ijt} + \alpha_i + \gamma_j + \delta_t + \lambda_{ij} + \omega_{it} + \pi_{jt}$$

where the $\eta_{ijt}$'s are independent identically distributed Normal random variables with mean zero and variance $\sigma^2$, $\frac{1}{\sigma^2}\,\varepsilon' M_0 \varepsilon$ has $\chi^2$ distribution with $N(n) - k$ degrees of freedom. Thus, for the estimator:

$$S^2 = \frac{e'e}{N(n) - k} = \frac{\varepsilon' M_0 \varepsilon}{N(n) - k}$$

of $\sigma^2$, $[N(n) - k]\,S^2/\sigma^2$ has $\chi^2$ distribution with $N(n) - k$ degrees of freedom.
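
The idempotency and trace argument can be illustrated numerically. This is only a sketch under simplifying assumptions: it uses the complete-data matrix V (built as a Kronecker product of centering matrices, an assumed construction) in place of $V_0$, so that the expected trace is $(I-1)(J-1)(T-1) - k$, and it uses an arbitrary randomly generated X.

```python
import numpy as np

def centering(n):
    """Return the n x n matrix that subtracts the mean over one dimension."""
    return np.eye(n) - np.ones((n, n)) / n

I, J, T, k = 3, 4, 5, 2

# Complete-data normalization matrix (assumption: all IJT observations used).
V = np.kron(centering(I), np.kron(centering(J), centering(T)))

# Hypothetical explanatory variables that vary over all three dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(I * J * T, k))

# M = V - V X (X'V X)^{-1} X'V, written with a linear solve instead of an inverse.
VX = V @ X
M = V - VX @ np.linalg.solve(X.T @ VX, VX.T)

trace_M = np.trace(M)
expected_rank = (I - 1) * (J - 1) * (T - 1) - k
M_is_idempotent = np.allclose(M @ M, M)
print("trace of M:", round(trace_M, 6), "expected:", expected_rank)
```

With I = 3, J = 4, T = 5 and k = 2 the expected trace is 24 - 2 = 22, which is the number of degrees of freedom the text assigns to the normalized regression in the complete-data case.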

For those cases in which the removal of observations is not systematic (that is, when observations are discarded in no regular pattern), computation of N(n) may involve many computations and may require that one keep track of a large number of values of $\|I(j,t)\|$, $\|J(i,t)\|$ and $\|T(i,j)\|$. It may, therefore, be beneficial to derive the distribution of an alternative random variable linear in $S^2$. The quantity:

$$\frac{\dfrac{(I-1)(J-1)(T-1)\,n}{IJT} - k}{N(n) - k}\;[N(n) - k]\,\frac{S^2}{\sigma^2} = \Big[\frac{(I-1)(J-1)(T-1)\,n}{IJT} - k\Big]\frac{S^2}{\sigma^2}$$

is linear in $[N(n) - k]\,S^2/\sigma^2$, hence has $\chi^2$ distribution, with degrees of freedom given by

$$E\Big\{\Big[\frac{(I-1)(J-1)(T-1)\,n}{IJT} - k\Big]\frac{S^2}{\sigma^2}\Big\} = \frac{(I-1)(J-1)(T-1)\,n}{IJT} - k.$$

Thus the analogy is completed.

I.  GENERALIZATION  TO  q  CROSS-SECTIONS 

There is a natural generalization of all of the preceding sections to the case in which q cross-sectional dimensions are involved. Previously, recall, all was described in terms of three cross-sectional dimensions.

Suppose q cross-sectional dimensions are being considered in the model $Y = X\beta + Z\Omega + \varepsilon$. Analogously to the case for q = 3, let the variables whose observations are represented by X and Y vary over all q dimensions, and let each variable in Z vary over at most q - 1 dimensions. Also let the disturbance term $\varepsilon$ be constructed analogously to the previously considered case, q = 3. That is, for q cross-sectional dimensions, with respective numbers of categories $I_1, \dots, I_q$, $\varepsilon$ is a linear combination of:

$$\prod_{k=1}^{q} I_k$$

random vectors, one of which varies over q cross-sectional dimensions (let this single random vector be denoted as $\eta$, as before, where the elements of $\eta$ are written with q subscripts) and the remaining

$$\prod_{k=1}^{q} I_k - 1$$

of which vary over at most q - 1 dimensions (that is, the elements of each of these remaining random vectors are written with fewer than q subscripts). Also, the elements of $\eta$ are independent, identically distributed Normal random variables, each with mean zero and variance $\sigma^2$, and the remaining

$$\prod_{k=1}^{q} I_k - 1$$

random variables are subject to any unknown distributions, and to any unknown conditions of stochastic non-independence.

All the properties that have been derived in preceding sections flowed naturally from a knowledge of the idempotent matrix V. Thus, in order to characterize the general case for q cross-sectional dimensions, it is only necessary to find the appropriate matrix $V_q$ whose properties are analogous to those of the previously defined V. To this end, let $C_{ij}$ be the subscript (in the notational expression for the elements of $\eta$; there are q such subscripts in the notational expression for each element of $\eta$) representing the i-th category of the j-th cross-sectional dimension, $j = 1, \dots, q$, $i = 1, \dots, I_j$.

Then the elements of $V_q = \frac{1}{\sigma^2} E(\mu\mu')$ are given, for $i_m = 1, \dots, I_m$ and $i'_m = 1, \dots, I_m$ ($m = 1, \dots, q$), by

$$\frac{1}{\sigma^2}\,\mathrm{Cov}\big(\mu_{C_{i_1 1},\dots,C_{i_q q}},\ \mu_{C_{i'_1 1},\dots,C_{i'_q q}}\big) = \frac{(-1)^{p+q}\,\prod_{m \in S}(I_m - 1)}{\prod_{k=1}^{q} I_k}$$

where:

$$S = \{\, m : C_{i_m m} = C_{i'_m m},\ m = 1, \dots, q \,\}$$

and p is the number of elements in S. When S is empty, define $\prod_{m \in S}(I_m - 1) = 1$.

That is: S is the set of all cross-sectional dimensions for which the subscripts $C_{i_m m}$ and $C_{i'_m m}$ are equal in the two normalized disturbances $\mu_{C_{i_1 1},\dots,C_{i_q q}}$ and $\mu_{C_{i'_1 1},\dots,C_{i'_q q}}$ whose covariance is an element of $V_q$. Or: S is the set of all cross-sectional dimensions for which the above two random variables correspond to the same category. Note that the set S depends on the two elements whose covariance is being considered.

To complete the analogy to the case q = 3, $V_q$ is an idempotent matrix of order

$$\prod_{k=1}^{q} I_k$$

and trace (= rank)

$$\prod_{k=1}^{q} (I_k - 1).$$

J.  THE  INAPPROPRIATELY  APPLIED  MODEL:  A  CASE  IN  WHICH  DISTURBANCE 
STRUCTURE  (A)  DOES  NOT  APPLY 

Before proceeding with this section, it may be instructive to amplify on the derivation of the transformation V. Note that the originally stated purpose of the transformation V was to rid the model $Y = X\beta + Z\Omega + \varepsilon$ of the effects of certain unobserved or unobservable explanatory variables. The disturbance structure (A) hypothesized in Part D was constructed, more or less artificially, to take advantage of the properties of V. Disturbance structure (A) is simply the most general case of the original problem: it contains all possible sources of error which the transformation V is able to remove. Consider a model of the form $Y = X\beta + Z\Omega + \varepsilon$ as previously introduced. Then the following statements are equivalent:

a.  $\varepsilon$ obeys disturbance structure (A).

b.  The elements of $\varepsilon$ are independent, identically distributed Normal random variables, each with mean zero and variance $\sigma^2$, and included in the specification of the model (specifically, in Z) is any variable (observed or not) which may be written as varying over fewer than q cross-sectional dimensions (q is the total number of dimensions involved in the data).

c.  No knowledge or information about the disturbance term $\varepsilon$ may reasonably be assumed except that at least one component of each $\varepsilon_{ijt}$ is a sample from a Normal population with mean zero and variance $\sigma^2$.
This situation suggests two useful observations. The first concerns the unobserved or unobservable explanatory variables which, by the dictates of theory (that is, theory relating to the subject being modeled) or other considerations, are necessarily included in some model of the form considered here. Note that, since the transformation V rids the model of these variables in any case (as long as each of these variables varies over fewer than q cross-sectional dimensions, where q is the total number of dimensions involved), it is conceptually and practically equivalent whether these variables are explicitly included in the formal statement of the model, or whether they are implicitly "thrown into" the disturbance term. This is a trite observation, but it is well worth noting for the following reason: some studies and analyses (see, for example, Nerlove [8]), when implicitly including an unobserved or unobservable explanatory variable as a component of the disturbance term, make a strong and possibly erroneous (footnote 7) assumption in order to complete the regression analysis (that is, in order to be able to claim an unbiased estimator of the regression coefficients) without using some transformation such as V to purge the model of the offending variable. Specifically, the required (footnote 8) assumption is that the disturbance term (which now implicitly includes unobserved and unobservable explanatory variables) has known mean, usually zero. [It is further typically assumed that the disturbance term is Normally distributed, although this assumption is not necessary if all one wishes to do is ensure that the estimator is unbiased.] That this assumption may be erroneous can be seen by considering two ways of arriving at it. One may simply make this assumption with no justification. But since theory, or other consideration, has dictated that the unobserved explanatory variable does have an effect on the dependent variable, the original problem still remains. And the resolution to that problem is still to remove the offending explanatory variable (whether explicitly included in the model or implicitly included as a component of the disturbance term) by some transformation such as V (footnote 9). Alternatively, one may attempt to justify the assumption by means of some device such as the Central Limit Theorem, in this case making the additional assumption that the components of the disturbance term, which now includes the unobserved explanatory variables, are independent. Ignoring for the moment the fact that this latter assumption is contrary to the assumptions of disturbance structure (A), this sort of argument may be reasonable in some cases. But in justifying the application of the Central Limit Theorem, in order to approximate a Normal random variable of known mean by a sum of random variables, one typically assumes that the disturbance term represents the net effect of numerous individually unimportant but collectively significant variables. But this is clearly not the case (at least this latest assumption cannot reasonably be made) when disturbance structure (A) pertains. And, more generally, it can be said that there are certainly studies of interest where this is not the case: the unobserved explanatory variable whose inclusion in the model was a necessity cannot in general be assumed not to dominate the disturbance term in which it is incorporated. In summary, there exist studies for which the use of a transformation such as V, to rid the model of undesired variables, is unavoidable if an unbiased estimator of the regression coefficients is to be obtained. Simply discarding an undesired variable as a component of a disturbance term with known mean should be viewed cautiously. As an example, in the reenlistment model, the inclusion of the terms WC. and C. in the disturbance term can be expected to have a large effect on the disturbance term.

Footnote 7. The term "erroneous" should be seen in context. The case of interest here is that in which there exists some unobserved explanatory variable which is expected to have a significant effect on the dependent variable. In addition, it is supposed that the analyst has no (or does not care to get any) information about the values of this variable. Such a variable may indeed not even be quantifiable.

Footnote 8. This assumption is characterized as "required" since unless it is made, some unobserved explanatory variable is, in effect, still being considered an explicit term in the model.

Footnote 9. Note that V may not be unique in this respect. For example, in the model

$$y_{it} = \alpha + \beta X_{it} + \gamma Z_i + \varepsilon_{it},$$

where one wishes to purge $Z_i$, the transformation W may be used, where: $W[y_{it}] = [y_{it} - y_{i\cdot}]$, $W[X_{it}] = [X_{it} - X_{i\cdot}]$, $W[Z_i] = [Z_i - Z_i] = 0$, $W[\varepsilon_{it}] = [\varepsilon_{it} - \varepsilon_{i\cdot}]$, $W[\alpha] = [\alpha - \alpha] = 0$. Here $[P_{it}]$ is an n-vector whose elements are $P_{it}$.

The second observation concerns the best linear unbiasedness of the estimator $B = (X'VX)^{-1}X'VY$ for $\beta$ in $Y = X\beta + Z\Omega + \varepsilon$. Recall that when disturbance structure (A) is assumed, B is the best linear unbiased estimator for $\beta$. Note that since, in disturbance structure (A), the random variables $\alpha$, $\gamma$, $\delta$, $\lambda$, $\omega$ and $\pi$ may assume any (unknown) distribution, and since any error terms in the model (except the $\eta_{ijt}$'s) may be interdependent, disturbance structure (A) is more general than that typically assumed (specifically, that error structure in which the elements of the disturbance term $\varepsilon$ are independent, identically distributed Normal random variables, each with mean zero and variance $\sigma^2$). But it is not a generalization of this latter error structure: the latter is not a special case of disturbance structure (A). This is so since disturbance structure (A) is based on a certain lack of specific information or knowledge about the characteristics of the components of the disturbance term. As a consequence, if the error structure which one wishes to assume is not that specified by disturbance structure (A), then $B = (X'VX)^{-1}X'VY$ is not necessarily the best linear unbiased estimator for $\beta$ in $Y = X\beta + Z\Omega + \varepsilon$.

This latest observation leads into the proper subject of this section: a consideration of a common case in which B is not the best linear unbiased estimator for $\beta$. For consistency of approach, suppose that the model is written in the form $Y = X\beta + \varepsilon$, where any unobserved or unobservable explanatory variables which were previously included in Z are now included in the disturbance term $\varepsilon$. As has been seen, $B = (X'VX)^{-1}X'VY$ is the best linear unbiased estimator for $\beta$ when $\varepsilon$ obeys disturbance structure (A). Consider the asymptotic properties of the matrix V in three cross-sectional dimensions. As the number of categories, I, J, and T, in each cross-sectional dimension goes to infinity, the elements of V behave as follows:

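$$
\begin{aligned}
\frac{(I-1)(J-1)(T-1)}{IJT} &= \Big(1-\frac{1}{I}\Big)\Big(1-\frac{1}{J}\Big)\Big(1-\frac{1}{T}\Big) \to 1, \\
-\frac{(I-1)(J-1)}{IJT} &= -\Big(1-\frac{1}{I}\Big)\Big(1-\frac{1}{J}\Big)\frac{1}{T} \to 0, \\
-\frac{(I-1)(T-1)}{IJT} &= -\Big(1-\frac{1}{I}\Big)\Big(1-\frac{1}{T}\Big)\frac{1}{J} \to 0, \\
-\frac{(J-1)(T-1)}{IJT} &= -\Big(1-\frac{1}{J}\Big)\Big(1-\frac{1}{T}\Big)\frac{1}{I} \to 0, \\
\frac{I-1}{IJT} &= \Big(1-\frac{1}{I}\Big)\frac{1}{J}\,\frac{1}{T} \to 0, \\
\frac{J-1}{IJT} &= \Big(1-\frac{1}{J}\Big)\frac{1}{I}\,\frac{1}{T} \to 0, \\
\frac{T-1}{IJT} &= \Big(1-\frac{1}{T}\Big)\frac{1}{I}\,\frac{1}{J} \to 0, \\
-\frac{1}{IJT} &\to 0.
\end{aligned}
$$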


[Note that when q cross-sectional dimensions are considered, the number of unique elements in V is $2^q$, since each element of V depends on the comparison of the subscripts of two random variables, each of which has q subscripts. These two random variables may either agree or disagree in each subscript. For q = 3, then, V has $2^3 = 8$ unique elements.] That is, the diagonal elements of V approach unity and all other elements of V approach zero. Or, as I, J, and T increase without bound, V tends to the identity matrix. As a consequence, $(X'VX)^{-1}X'VY$ approaches $(X'X)^{-1}X'Y$ as I, J and T become infinitely large. Hence, in the case that $\varepsilon$ obeys disturbance structure (A), the ordinary least squares estimator $\hat\beta = (X'X)^{-1}X'Y$ is in the limit (in I, J and T) an unbiased estimator for $\beta$, since it is the limit of a sequence of unbiased estimators. This suggests that, for sufficiently large I, J and T, the ordinary least squares estimator for $\beta$, $\hat\beta = (X'X)^{-1}X'Y$, could serve to approximate the best linear unbiased estimator B when disturbance structure (A) holds. This line of thought will not be pursued: it is the converse suggestion, that B can serve to approximate $\hat\beta$ for sufficiently large I, J and T, that is more interesting here. Suppose that the transformation V was inappropriately applied to the model $Y = X\beta + \varepsilon$. Specifically, suppose that the components of $\varepsilon$ are independent, identically distributed Normal random variables with mean zero and variance $\sigma^2$. Call this disturbance structure (B). Then the ordinary least squares estimator $\hat\beta = (X'X)^{-1}X'Y$ is the best linear unbiased estimator for $\beta$. Note that $B = (X'VX)^{-1}X'VY$ is still an unbiased estimator for $\beta$, but it is no longer best. But since V approaches the identity matrix as I, J and T increase, the less efficient estimator B approaches $(X'X)^{-1}X'Y$ as well. This suggests a pragmatic comparative scheme for the two estimators B and $\hat\beta$:


Footnote 10. In treating a subject related to that considered here, Wallace and Hussain [9] have shown the asymptotic equivalence of the Aitken estimator and an estimator derived under a linear transformation (much as B was derived from the linear transformation V) for a particular error structure. In the disturbance structure considered in their paper, the disturbance term was assumed to be a sum of independent random variables (in a combined time series and cross-section analysis),

$$\varepsilon_{it} = \alpha_i + \gamma_t + \eta_{it}, \quad \text{for which } E(\alpha_i) = E(\gamma_t) = E(\eta_{it}) = 0$$

and $\mathrm{Var}(\alpha_i) = \sigma_1^2$, $\mathrm{Var}(\gamma_t) = \sigma_2^2$, $\mathrm{Var}(\eta_{it}) = \sigma_3^2$ for all i, t, where $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$ were known. The paper also showed the equivalence of the iterative Aitken estimator and the estimator derived under a linear transformation for the disturbance structure as above with $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$ unknown.

1.  Suppose disturbance structure (A) applies. Then $\hat\beta$ is biased, and B is the best linear unbiased estimator and should reasonably be used.

2.  Suppose on the other hand that disturbance structure (B) is assumed to hold. Then $\hat\beta$ and B are both unbiased estimators, although B is less efficient than $\hat\beta$. But note that B has an advantage which may offset (on a case-by-case basis) its lesser efficiency: it is guaranteed to purge all random variables which are invariant over at least one cross-sectional dimension. That is, if one is unsure of the validity of the assumption that disturbance structure (B) holds, then one may see some value in applying the transformation V in order to rid the model of all such possible sources of error.
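
The limiting behaviour of the elements of V that underlies this comparison can be tabulated directly from the eight closed-form entries. The short sketch below is illustrative only; the function name `distinct_elements` is an assumption, and the case I = J = T is chosen purely for convenience.

```python
def distinct_elements(I, J, T):
    """Return the diagonal entry of V and the largest absolute off-diagonal entry."""
    N = I * J * T
    diagonal = (I - 1) * (J - 1) * (T - 1) / N
    off_diagonal = [
        -(J - 1) * (T - 1) / N,  # cells differ in i only
        -(I - 1) * (T - 1) / N,  # cells differ in j only
        -(I - 1) * (J - 1) / N,  # cells differ in t only
        (T - 1) / N,             # cells differ in i and j
        (J - 1) / N,             # cells differ in i and t
        (I - 1) / N,             # cells differ in j and t
        -1 / N,                  # cells differ in all three subscripts
    ]
    return diagonal, max(abs(v) for v in off_diagonal)

# Tabulate the entries for an increasing number of categories per dimension.
table = {n: distinct_elements(n, n, n) for n in (2, 10, 100, 1000)}
for n, (diag, off) in table.items():
    print(f"I=J=T={n}: diagonal={diag:.6f}, largest |off-diagonal|={off:.6f}")
```

The diagonal entry rises toward one and every off-diagonal entry shrinks toward zero as the number of categories grows, which is the elementwise sense in which V tends to the identity matrix.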

Two concluding observations should now be made. First, it is clear that application of the transformation V is equally inappropriate in all other cases where disturbance structure (A) does not hold in the model $Y = X\beta + \varepsilon$. An important special case is that in which the generalized least squares estimator for $\beta$ is appropriate. Just as the ordinary least squares estimator $\hat\beta = (X'X)^{-1}X'Y$ is the best linear unbiased estimator for $\beta$ when $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, the Aitken estimator $\tilde\beta = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$ is the best linear unbiased estimator for $\beta$ for the case in which $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2\Omega$.

Finally, it is worth repeating the crucial condition which underlies the specification of the case in which the transformation V is effective. In the model $Y = X\beta + Z\Omega + \varepsilon$ (or in the model $Y = X\beta + \varepsilon$, equivalent under the transformation V, where the variables in Z are thrown into the disturbance term $\varepsilon$) V is effective in removing unobserved or unobservable variables (stochastic or deterministic) only if these variables are invariant over at least one cross-sectional dimension. Accordingly, all work in this paper is performed under the assumption that each variable in X (those variables which vary over all cross-sectional dimensions) has been observed.
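
The purging property summarized above can be illustrated with a small constructed example. The sketch below makes several assumptions that are not in the text: V is built as a Kronecker product of centering matrices, the data are simulated, the full-dimensional component $\eta$ is set to zero so that recovery of $\beta$ is exact, and one regressor is deliberately correlated with an unobserved (i, j) effect so that ordinary least squares is visibly biased.

```python
import numpy as np

def centering(n):
    """Return the n x n matrix that subtracts the mean over one dimension."""
    return np.eye(n) - np.ones((n, n)) / n

I, J, T = 4, 3, 5
rng = np.random.default_rng(1)
beta = np.array([2.0, -1.0])

# Unobserved effects, each invariant over at least one dimension.
lam = rng.normal(size=(I, J, 1))  # varies over (i, j) only
pi = rng.normal(size=(1, J, T))   # varies over (j, t) only

# Observed regressors vary over all three dimensions; the first one is
# correlated with the unobserved (i, j) effect.
X = rng.normal(size=(I, J, T, 2))
X[..., 0] += 2.0 * lam[..., 0][..., None]

# Dependent variable with the unobserved effects left in the disturbance
# (the full-dimensional noise eta is set to zero in this illustration).
Y = X @ beta + 5.0 * lam + pi

# Stack cells in (i, j, t) order, t fastest, to match the Kronecker product.
Xm = X.reshape(-1, 2)
Ym = Y.reshape(-1)
V = np.kron(centering(I), np.kron(centering(J), centering(T)))

# B = (X'VX)^{-1} X'VY versus ordinary least squares (X'X)^{-1} X'Y.
B = np.linalg.solve(Xm.T @ V @ Xm, Xm.T @ V @ Ym)
ols = np.linalg.solve(Xm.T @ Xm, Xm.T @ Ym)
print("true beta:", beta, "B:", B, "OLS:", ols)
```

In this construction B reproduces the true coefficients to rounding error because V annihilates every component that is constant over at least one dimension, while the ordinary least squares coefficient on the first regressor absorbs part of the omitted (i, j) effect.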

K.   INTERPRETATION  OF  TERMS  UNDER  THE  TRANSFORMATION  V 

Consider the model in the form $Y = X\beta + \epsilon$, in three
cross-sectional dimensions. The equation representing the data in the
$i$th category of the first cross-sectional dimension, the $j$th category
of the second dimension and the $t$th category of the third dimension is
$y_{ijt} = x_{ijt}\beta + \epsilon_{ijt}$, where $x_{ijt}$ is a k-vector
of observations on the k explanatory variables in X. The categories of the
cross-sectional dimensions corresponding to the observations $y_{ijt}$ and
$x_{ijt}$ may be considered to be "treatments" which affect the values of
the observations of $y_{ijt}$ and $x_{ijt}$ in the $(i, j, t)$th "cell".
With this in mind, assume that each $y_{ijt}$ and $x_{ijt}$ can be
represented as a sum of a common mean, effects due to single treatments
(here i, j, t represent the "treatments"), two-way interaction effects of
pairs of treatments, and a three-way interaction effect of the three
treatments. [Note that since there is only one observation (on each of
$y_{ijt}$ and $x_{ijt}$) per "cell", it is generally not possible to
distinguish between the effect of the three-way interaction term and the
error term $\epsilon_{ijt}$. In this case, however, it is known that a
three-way interaction term does in fact exist. That this is so can be seen
as follows: since $x_{ijt}$ is deterministic, one can calculate the exact
three-way interaction effect for cell $(i, j, t)$ as

$$x_{ijt} - x_{ij.} - x_{i.t} - x_{.jt} + x_{i..} + x_{.j.} + x_{..t} - x_{...},$$

subject only to roundoff error (this expression is the same as that of a
sample estimate of the three-way interaction effect for the case of
stochastic $x_{ijt}$). This is not identically zero (for all cells), by
previous hypothesis about the variables in X, so a three-way interaction
effect is present. And since $y_{ijt}$ is a linear function of $x_{ijt}$,
$y_{ijt}$ also includes a three-way interaction effect.] That is:

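$$y_{ijt} = \mu + \theta_{ijt} + A_i + B_j + C_t + D_{ij} + E_{it} + F_{jt} + \epsilon_{ijt}$$

$$x_{ijt} = \mu^{\circ} + \phi_{ijt} + A^{\circ}_i + B^{\circ}_j + C^{\circ}_t + D^{\circ}_{ij} + E^{\circ}_{it} + F^{\circ}_{jt},$$

where $\theta_{ijt}$ and $\phi_{ijt}$ are the three-way interaction terms
mentioned above.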
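
The distinction between a variable that carries a three-way interaction
term and one that does not can be checked numerically. The following is a
minimal Python sketch, not part of the original analysis. It assumes NumPy,
assumes that a dot subscript denotes the average over that index, and uses
a hypothetical helper `three_way_interaction` together with simulated
arrays whose dimensions are chosen only for illustration.

```python
import numpy as np

def three_way_interaction(z):
    """Return z_ijt - z_ij. - z_i.t - z_.jt + z_i.. + z_.j. + z_..t - z_...
    for a 3-D array z, where a dot denotes the mean over that index."""
    m_t = z.mean(axis=2, keepdims=True)          # z_ij.
    m_j = z.mean(axis=1, keepdims=True)          # z_i.t
    m_i = z.mean(axis=0, keepdims=True)          # z_.jt
    m_jt = z.mean(axis=(1, 2), keepdims=True)    # z_i..
    m_it = z.mean(axis=(0, 2), keepdims=True)    # z_.j.
    m_ij = z.mean(axis=(0, 1), keepdims=True)    # z_..t
    m_all = z.mean()                             # z_...
    return z - m_t - m_j - m_i + m_jt + m_it + m_ij - m_all

rng = np.random.default_rng(0)
I, J, T = 4, 3, 5   # illustrative numbers of categories (assumed)

# Hypothetical variable built only from a mean, single-treatment effects
# and two-way interaction effects (no three-way term).
two_way_only = (2.0
                + rng.normal(size=(I, 1, 1))
                + rng.normal(size=(1, J, 1))
                + rng.normal(size=(1, 1, T))
                + rng.normal(size=(I, J, 1))
                + rng.normal(size=(I, 1, T))
                + rng.normal(size=(1, J, T)))

# Hypothetical variable that varies freely over all three dimensions.
full = rng.normal(size=(I, J, T))

max_abs_two_way = float(np.abs(three_way_interaction(two_way_only)).max())
max_abs_full = float(np.abs(three_way_interaction(full)).max())
```

In this sketch `max_abs_two_way` is zero up to roundoff, while
`max_abs_full` is not, which mirrors the argument that a variable varying
over all cross-sectional dimensions has a non-vanishing three-way
interaction term.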


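Substituting these into the model gives

$$y_{ijt} = \mu + \theta_{ijt} + A_i + B_j + C_t + D_{ij} + E_{it} + F_{jt} + \epsilon_{ijt}$$

$$\quad = (\mu^{\circ} + \phi_{ijt} + A^{\circ}_i + B^{\circ}_j + C^{\circ}_t + D^{\circ}_{ij} + E^{\circ}_{it} + F^{\circ}_{jt})\,\beta + \epsilon_{ijt}.$$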


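These effects can be equated term by term to give

$$\mu = \mu^{\circ}\beta$$

and:

$$\begin{aligned}
A_i &= A^{\circ}_i\,\beta \\
B_j &= B^{\circ}_j\,\beta \\
C_t &= C^{\circ}_t\,\beta \\
D_{ij} &= D^{\circ}_{ij}\,\beta \\
E_{it} &= E^{\circ}_{it}\,\beta \\
F_{jt} &= F^{\circ}_{jt}\,\beta \\
\theta_{ijt} &= \phi_{ijt}\,\beta \qquad (*)
\end{aligned}$$

Now consider the data under the transformation V:
$VY = VX\beta + V\epsilon$. In the $(i, j, t)$th cell this gives:

$$y_{ijt} - y_{ij.} - y_{i.t} - y_{.jt} + y_{i..} + y_{.j.} + y_{..t} - y_{...}$$

$$\quad = (x_{ijt} - x_{ij.} - x_{i.t} - x_{.jt} + x_{i..} + x_{.j.} + x_{..t} - x_{...})\,\beta + (V\epsilon)_{ijt},$$

where $(V\epsilon)_{ijt}$ is the $(i, j, t)$th element of $V\epsilon$.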

Note that the left-hand side of this equation is the sample estimate of
the three-way interaction term $\theta_{ijt}$, and the term in parentheses
on the right-hand side is the three-way interaction term $\phi_{ijt}$.
This is the relationship specified in (*) above, with a sample estimate
for $\theta_{ijt}$ replacing $\theta_{ijt}$ and with a disturbance term
$(V\epsilon)_{ijt}$ included. That is, under the assumption that $y_{ijt}$
and $x_{ijt}$ can each be represented as a sum of a common mean, effects
due to single treatments, two-way interaction effects of pairs of
treatments, and a three-way interaction effect, it is true that
$\theta_{ijt} = \phi_{ijt}\beta$. Hence $\beta$ can be estimated by
regressing the sample estimate of the three-way interaction term
$\theta_{ijt}$ on the three-way interaction term $\phi_{ijt}$. This is
precisely what the estimator $\hat{\beta} = (X'VX)^{-1}X'VY$ accomplishes.
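
The equivalence between the estimator and a regression of three-way
interaction terms can be illustrated with simulated data. The sketch below
is only an illustration under stated assumptions. It assumes NumPy, assumes
observations are stacked with t varying fastest (then j, then i), and
assumes V can be written for that ordering as a Kronecker product of
centering matrices, which reproduces the cell formula given above when a
dot denotes an average. The coefficient values, dimensions and the
`omitted` effects are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, T, k = 5, 4, 6, 2                 # illustrative sizes (assumed)
beta = np.array([1.5, -0.7])            # hypothetical true coefficients

def centering(n):
    """n x n matrix that subtracts the mean from an n-vector."""
    return np.eye(n) - np.ones((n, n)) / n

# V for observations stacked with t fastest, then j, then i (assumed order).
V = np.kron(centering(I), np.kron(centering(J), centering(T)))

# Observed explanatory variables, varying over all three dimensions.
X = rng.normal(size=(I * J * T, k))

# Unobserved effects that vary over at most two dimensions at a time.
omitted = (rng.normal(size=(I, J, 1))
           + rng.normal(size=(I, 1, T))
           + rng.normal(size=(1, J, T))).reshape(-1)

X[:, 0] += omitted                      # correlate X with the omitted effects
eps = 0.01 * rng.normal(size=I * J * T)
Y = X @ beta + 3.0 * omitted + eps

# Estimator from the text: (X'VX)^{-1} X'VY.
beta_v = np.linalg.solve(X.T @ V @ X, X.T @ V @ Y)

# Regression of the three-way interaction of Y on that of X.
theta_hat = V @ Y
phi = V @ X
beta_int, *_ = np.linalg.lstsq(phi, theta_hat, rcond=None)

# Untransformed least squares that ignores the omitted effects, for contrast.
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Because V is symmetric and idempotent in this construction, `beta_v` and
`beta_int` agree to roundoff, and both lie close to `beta`, whereas
`beta_ols` is pulled away from `beta` by the omitted effects.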
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